Bankruptcy Prediction
A company goes bankrupt when it can no longer meet its debt obligations, typically because its liabilities exceed its assets. Most bankruptcy models are based on financial indicators that describe the company's current condition or a particular area of financial health, such as profitability or indebtedness. This study analyses the financial statements and market data of these companies and applies several models to predict bankruptcy. The goal is to find out how far in advance these models can predict that a company will get into financial distress, and which financial ratios improve the accuracy of the bankruptcy prediction model.
Let's work!
# importing needed libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
from statsmodels.stats.outliers_influence import variance_inflation_factor as vif
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
import statsmodels.api as sm
from sklearn import metrics
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE
from sklearn.feature_selection import RFECV
from matplotlib.colors import ListedColormap
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import VotingClassifier
from scipy.stats import randint as sp_randint
import warnings
warnings.filterwarnings('ignore')
# display all columns of the dataframe
pd.options.display.max_columns = None
# display all rows of the dataframe
pd.options.display.max_rows = None
# display float values with 6 decimal places
pd.options.display.float_format = '{:.6f}'.format
# setting default figure and font size
plt.rcParams['figure.figsize'] = (15,8)
plt.rcParams['font.size'] = 15
Let's import the dataset and start working.
# reading csv file using pandas
df = pd.read_csv('data.csv')
# displaying the top 5 rows of the dataframe
df.head()
| | Bankrupt? | ROA(C) before interest and depreciation before interest | ROA(A) before interest and % after tax | ROA(B) before interest and depreciation after tax | Operating Gross Margin | Realized Sales Gross Margin | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Continuous interest rate (after tax) | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Net Value Per Share (A) | Net Value Per Share (C) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Operating Profit Per Share (Yuan ¥) | Per Share Net profit before tax (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Debt ratio % | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Total Asset Turnover | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Current Liability to Liability | Current Liability to Equity | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Liability-Assets Flag | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Net Income Flag | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.370594 | 0.424389 | 0.405750 | 0.601457 | 0.601457 | 0.998969 | 0.796887 | 0.808809 | 0.302646 | 0.780985 | 0.000126 | 0.000000 | 0.458143 | 0.000725 | 0.000000 | 0.147950 | 0.147950 | 0.147950 | 0.169141 | 0.311664 | 0.017560 | 0.095921 | 0.138736 | 0.022102 | 0.848195 | 0.688979 | 0.688979 | 0.217535 | 4980000000.000000 | 0.000327 | 0.263100 | 0.363725 | 0.002259 | 0.001208 | 0.629951 | 0.021266 | 0.207576 | 0.792424 | 0.005024 | 0.390284 | 0.006479 | 0.095885 | 0.137757 | 0.398036 | 0.086957 | 0.001814 | 0.003487 | 0.000182 | 0.000117 | 0.032903 | 0.034164 | 0.392913 | 0.037135 | 0.672775 | 0.166673 | 0.190643 | 0.004094 | 0.001997 | 0.000147 | 0.147308 | 0.334015 | 0.276920 | 0.001036 | 0.676269 | 0.721275 | 0.339077 | 0.025592 | 0.903225 | 0.002022 | 0.064856 | 701000000.000000 | 6550000000.000000 | 0.593831 | 458000000.000000 | 0.671568 | 0.424206 | 0.676269 | 0.339077 | 0.126549 | 0.637555 | 0.458609 | 0.520382 | 0.312905 | 0.118250 | 0 | 0.716845 | 0.009219 | 0.622879 | 0.601453 | 0.827890 | 0.290202 | 0.026601 | 0.564050 | 1 | 0.016469 |
| 1 | 1 | 0.464291 | 0.538214 | 0.516730 | 0.610235 | 0.610235 | 0.998946 | 0.797380 | 0.809301 | 0.303556 | 0.781506 | 0.000290 | 0.000000 | 0.461867 | 0.000647 | 0.000000 | 0.182251 | 0.182251 | 0.182251 | 0.208944 | 0.318137 | 0.021144 | 0.093722 | 0.169918 | 0.022080 | 0.848088 | 0.689693 | 0.689702 | 0.217620 | 6110000000.000000 | 0.000443 | 0.264516 | 0.376709 | 0.006016 | 0.004039 | 0.635172 | 0.012502 | 0.171176 | 0.828824 | 0.005059 | 0.376760 | 0.005835 | 0.093743 | 0.168962 | 0.397725 | 0.064468 | 0.001286 | 0.004917 | 9360000000.000000 | 719000000.000000 | 0.025484 | 0.006889 | 0.391590 | 0.012335 | 0.751111 | 0.127236 | 0.182419 | 0.014948 | 0.004136 | 0.001384 | 0.056963 | 0.341106 | 0.289642 | 0.005210 | 0.308589 | 0.731975 | 0.329740 | 0.023947 | 0.931065 | 0.002226 | 0.025516 | 0.000107 | 7700000000.000000 | 0.593916 | 2490000000.000000 | 0.671570 | 0.468828 | 0.308589 | 0.329740 | 0.120916 | 0.641100 | 0.459001 | 0.567101 | 0.314163 | 0.047775 | 0 | 0.795297 | 0.008323 | 0.623652 | 0.610237 | 0.839969 | 0.283846 | 0.264577 | 0.570175 | 1 | 0.020794 |
| 2 | 1 | 0.426071 | 0.499019 | 0.472295 | 0.601450 | 0.601364 | 0.998857 | 0.796403 | 0.808388 | 0.302035 | 0.780284 | 0.000236 | 25500000.000000 | 0.458521 | 0.000790 | 0.000000 | 0.177911 | 0.177911 | 0.193713 | 0.180581 | 0.307102 | 0.005944 | 0.092338 | 0.142803 | 0.022760 | 0.848094 | 0.689463 | 0.689470 | 0.217601 | 7280000000.000000 | 0.000396 | 0.264184 | 0.368913 | 0.011543 | 0.005348 | 0.629631 | 0.021248 | 0.207516 | 0.792484 | 0.005100 | 0.379093 | 0.006562 | 0.092318 | 0.148036 | 0.406580 | 0.014993 | 0.001495 | 0.004227 | 65000000.000000 | 2650000000.000000 | 0.013387 | 0.028997 | 0.381968 | 0.141016 | 0.829502 | 0.340201 | 0.602806 | 0.000991 | 0.006302 | 5340000000.000000 | 0.098162 | 0.336731 | 0.277456 | 0.013879 | 0.446027 | 0.742729 | 0.334777 | 0.003715 | 0.909903 | 0.002060 | 0.021387 | 0.001791 | 0.001023 | 0.594502 | 761000000.000000 | 0.671571 | 0.276179 | 0.446027 | 0.334777 | 0.117922 | 0.642765 | 0.459254 | 0.538491 | 0.314515 | 0.025346 | 0 | 0.774670 | 0.040003 | 0.623841 | 0.601449 | 0.836774 | 0.290189 | 0.026555 | 0.563706 | 1 | 0.016474 |
| 3 | 1 | 0.399844 | 0.451265 | 0.457733 | 0.583541 | 0.583541 | 0.998700 | 0.796967 | 0.808966 | 0.303350 | 0.781241 | 0.000108 | 0.000000 | 0.465705 | 0.000449 | 0.000000 | 0.154187 | 0.154187 | 0.154187 | 0.193722 | 0.321674 | 0.014368 | 0.077762 | 0.148603 | 0.022046 | 0.848005 | 0.689110 | 0.689110 | 0.217568 | 4880000000.000000 | 0.000382 | 0.263371 | 0.384077 | 0.004194 | 0.002896 | 0.630228 | 0.009572 | 0.151465 | 0.848535 | 0.005047 | 0.379743 | 0.005366 | 0.077727 | 0.147561 | 0.397925 | 0.089955 | 0.001966 | 0.003215 | 7130000000.000000 | 9150000000.000000 | 0.028065 | 0.015463 | 0.378497 | 0.021320 | 0.725754 | 0.161575 | 0.225815 | 0.018851 | 0.002961 | 0.001011 | 0.098715 | 0.348716 | 0.276580 | 0.003540 | 0.615848 | 0.729825 | 0.331509 | 0.022165 | 0.906902 | 0.001831 | 0.024161 | 8140000000.000000 | 6050000000.000000 | 0.593889 | 2030000000.000000 | 0.671519 | 0.559144 | 0.615848 | 0.331509 | 0.120760 | 0.579039 | 0.448518 | 0.604105 | 0.302382 | 0.067250 | 0 | 0.739555 | 0.003252 | 0.622929 | 0.583538 | 0.834697 | 0.281721 | 0.026697 | 0.564663 | 1 | 0.023982 |
| 4 | 1 | 0.465022 | 0.538432 | 0.522298 | 0.598783 | 0.598783 | 0.998973 | 0.797366 | 0.809304 | 0.303475 | 0.781550 | 7890000000.000000 | 0.000000 | 0.462746 | 0.000686 | 0.000000 | 0.167502 | 0.167502 | 0.167502 | 0.212537 | 0.319162 | 0.029690 | 0.096898 | 0.168412 | 0.022096 | 0.848258 | 0.689697 | 0.689697 | 0.217626 | 5510000000.000000 | 0.000439 | 0.265218 | 0.379690 | 0.006022 | 0.003727 | 0.636055 | 0.005150 | 0.106509 | 0.893491 | 0.005303 | 0.375025 | 0.006624 | 0.096927 | 0.167461 | 0.400079 | 0.175412 | 0.001449 | 0.004367 | 0.000163 | 0.000294 | 0.040161 | 0.058111 | 0.394371 | 0.023988 | 0.751822 | 0.260330 | 0.358380 | 0.014161 | 0.004275 | 0.000680 | 0.110195 | 0.344639 | 0.287913 | 0.004869 | 0.975007 | 0.732000 | 0.330726 | 0.000000 | 0.913850 | 0.002224 | 0.026385 | 6680000000.000000 | 5050000000.000000 | 0.593915 | 824000000.000000 | 0.671563 | 0.309555 | 0.975007 | 0.330726 | 0.110933 | 0.622374 | 0.454411 | 0.578469 | 0.311567 | 0.047725 | 0 | 0.795016 | 0.003878 | 0.623521 | 0.598782 | 0.839973 | 0.278514 | 0.024752 | 0.575617 | 1 | 0.035490 |
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
We will follow the steps below to analyse the dataset:
1. Check the dimensions of the dataframe (rows and columns)
2. Check the data types. Refer to the data definition to ensure the data types are correct; if not, make the necessary changes
3. Study summary statistics
4. Univariate analysis: visualise the target and study the distributions of the independent variables
5. Detect outliers
6. Check for missing and duplicate values
7. Bivariate analysis: study correlation
8. Analyse the relationship between the target variable and the independent variables
# checking the shape of the df
df.shape
(6819, 96)
There are 6819 rows and 96 columns in the dataset
# making a copy of the df
df1 = df.copy()
# checking data types
df1.dtypes
Bankrupt?    int64
ROA(C) before interest and depreciation before interest    float64
ROA(A) before interest and % after tax    float64
ROA(B) before interest and depreciation after tax    float64
Operating Gross Margin    float64
Realized Sales Gross Margin    float64
Operating Profit Rate    float64
Pre-tax net Interest Rate    float64
After-tax net Interest Rate    float64
Non-industry income and expenditure/revenue    float64
Continuous interest rate (after tax)    float64
Operating Expense Rate    float64
Research and development expense rate    float64
Cash flow rate    float64
Interest-bearing debt interest rate    float64
Tax rate (A)    float64
Net Value Per Share (B)    float64
Net Value Per Share (A)    float64
Net Value Per Share (C)    float64
Persistent EPS in the Last Four Seasons    float64
Cash Flow Per Share    float64
Revenue Per Share (Yuan ¥)    float64
Operating Profit Per Share (Yuan ¥)    float64
Per Share Net profit before tax (Yuan ¥)    float64
Realized Sales Gross Profit Growth Rate    float64
Operating Profit Growth Rate    float64
After-tax Net Profit Growth Rate    float64
Regular Net Profit Growth Rate    float64
Continuous Net Profit Growth Rate    float64
Total Asset Growth Rate    float64
Net Value Growth Rate    float64
Total Asset Return Growth Rate Ratio    float64
Cash Reinvestment %    float64
Current Ratio    float64
Quick Ratio    float64
Interest Expense Ratio    float64
Total debt/Total net worth    float64
Debt ratio %    float64
Net worth/Assets    float64
Long-term fund suitability ratio (A)    float64
Borrowing dependency    float64
Contingent liabilities/Net worth    float64
Operating profit/Paid-in capital    float64
Net profit before tax/Paid-in capital    float64
Inventory and accounts receivable/Net value    float64
Total Asset Turnover    float64
Accounts Receivable Turnover    float64
Average Collection Days    float64
Inventory Turnover Rate (times)    float64
Fixed Assets Turnover Frequency    float64
Net Worth Turnover Rate (times)    float64
Revenue per person    float64
Operating profit per person    float64
Allocation rate per person    float64
Working Capital to Total Assets    float64
Quick Assets/Total Assets    float64
Current Assets/Total Assets    float64
Cash/Total Assets    float64
Quick Assets/Current Liability    float64
Cash/Current Liability    float64
Current Liability to Assets    float64
Operating Funds to Liability    float64
Inventory/Working Capital    float64
Inventory/Current Liability    float64
Current Liabilities/Liability    float64
Working Capital/Equity    float64
Current Liabilities/Equity    float64
Long-term Liability to Current Assets    float64
Retained Earnings to Total Assets    float64
Total income/Total expense    float64
Total expense/Assets    float64
Current Asset Turnover Rate    float64
Quick Asset Turnover Rate    float64
Working capitcal Turnover Rate    float64
Cash Turnover Rate    float64
Cash Flow to Sales    float64
Fixed Assets to Assets    float64
Current Liability to Liability    float64
Current Liability to Equity    float64
Equity to Long-term Liability    float64
Cash Flow to Total Assets    float64
Cash Flow to Liability    float64
CFO to Assets    float64
Cash Flow to Equity    float64
Current Liability to Current Assets    float64
Liability-Assets Flag    int64
Net Income to Total Assets    float64
Total assets to GNP price    float64
No-credit Interval    float64
Gross Profit to Sales    float64
Net Income to Stockholder's Equity    float64
Liability to Equity    float64
Degree of Financial Leverage (DFL)    float64
Interest Coverage Ratio (Interest expense to EBIT)    float64
Net Income Flag    int64
Equity to Liability    float64
dtype: object
All features are of a numeric data type (int or float).
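With 96 columns, the full dtypes listing is long to scan. A compact alternative is to count the columns per data type and list any that are not numeric. The sketch below assumes a small made-up frame named `toy` standing in for `df1`; its values are illustrative only.

```python
import pandas as pd

# hypothetical stand-in for df1; the values are illustrative only
toy = pd.DataFrame({
    "Bankrupt?": [1, 0, 0],
    "Current Ratio": [0.002259, 0.006016, 0.011543],
    "Quick Ratio": [0.001208, 0.004039, 0.005348],
})

# number of columns per data type
dtype_counts = toy.dtypes.value_counts()
print(dtype_counts)

# columns that are not numeric (an empty list if every column is int or float)
non_numeric = toy.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

On the real data, the same two calls on `df1` should confirm the observation above without printing every column.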
Let's compute summary statistics on the dataset and remove insignificant variables.
# getting summary stats using df.describe
df1.describe()
| | Bankrupt? | ROA(C) before interest and depreciation before interest | ROA(A) before interest and % after tax | ROA(B) before interest and depreciation after tax | Operating Gross Margin | Realized Sales Gross Margin | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Continuous interest rate (after tax) | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Net Value Per Share (A) | Net Value Per Share (C) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Operating Profit Per Share (Yuan ¥) | Per Share Net profit before tax (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Debt ratio % | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Total Asset Turnover | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Current Liability to Liability | Current Liability to Equity | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Liability-Assets Flag | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Net Income Flag | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 | 6819.000000 |
| mean | 0.032263 | 0.505180 | 0.558625 | 0.553589 | 0.607948 | 0.607929 | 0.998755 | 0.797190 | 0.809084 | 0.303623 | 0.781381 | 1995347312.802792 | 1950427306.056799 | 0.467431 | 16448012.905942 | 0.115001 | 0.190661 | 0.190633 | 0.190672 | 0.228813 | 0.323482 | 1328640.602096 | 0.109091 | 0.184361 | 0.022408 | 0.847980 | 0.689146 | 0.689150 | 0.217639 | 5508096595.248731 | 1566212.055241 | 0.264248 | 0.379677 | 403284.954245 | 8376594.819685 | 0.630991 | 4416336.714259 | 0.113177 | 0.886823 | 0.008783 | 0.374654 | 0.005968 | 0.108977 | 0.182715 | 0.402459 | 0.141606 | 12789705.237554 | 9826220.861192 | 2149106056.607530 | 1008595981.817477 | 0.038595 | 2325854.266358 | 0.400671 | 11255785.321742 | 0.814125 | 0.400132 | 0.522273 | 0.124095 | 3592902.196830 | 37159994.147133 | 0.090673 | 0.353828 | 0.277395 | 55806804.525780 | 0.761599 | 0.735817 | 0.331410 | 54160038.135894 | 0.934733 | 0.002549 | 0.029184 | 1195855763.308841 | 2163735272.034319 | 0.594006 | 2471976967.444247 | 0.671531 | 1220120.501590 | 0.761599 | 0.331410 | 0.115645 | 0.649731 | 0.461849 | 0.593415 | 0.315582 | 0.031506 | 0.001173 | 0.807760 | 18629417.811836 | 0.623915 | 0.607946 | 0.840402 | 0.280365 | 0.027541 | 0.565358 | 1.000000 | 0.047578 |
| std | 0.176710 | 0.060686 | 0.065620 | 0.061595 | 0.016934 | 0.016916 | 0.013010 | 0.012869 | 0.013601 | 0.011163 | 0.012679 | 3237683890.522487 | 2598291553.998342 | 0.017036 | 108275033.532823 | 0.138667 | 0.033390 | 0.033474 | 0.033480 | 0.033263 | 0.017611 | 51707089.767907 | 0.027942 | 0.033180 | 0.012079 | 0.010752 | 0.013853 | 0.013910 | 0.010063 | 2897717771.169734 | 114159389.518336 | 0.009634 | 0.020737 | 33302155.825480 | 244684748.446872 | 0.011238 | 168406905.281511 | 0.053920 | 0.053920 | 0.028153 | 0.016286 | 0.012188 | 0.027782 | 0.030785 | 0.013324 | 0.101145 | 278259836.984053 | 256358895.705332 | 3247967014.047904 | 2477557316.920172 | 0.036680 | 136632654.389936 | 0.032720 | 294506294.116772 | 0.059054 | 0.201998 | 0.218112 | 0.139251 | 171620908.606822 | 510350903.162733 | 0.050290 | 0.035147 | 0.010469 | 582051554.619420 | 0.206677 | 0.011678 | 0.013488 | 570270621.959227 | 0.025564 | 0.012093 | 0.027149 | 2821161238.262457 | 3374944402.166119 | 0.008959 | 2938623226.678810 | 0.009341 | 100754158.713168 | 0.206677 | 0.013488 | 0.019529 | 0.047372 | 0.029943 | 0.058561 | 0.012961 | 0.030845 | 0.034234 | 0.040332 | 376450059.745829 | 0.012290 | 0.016934 | 0.014523 | 0.014463 | 0.015668 | 0.013214 | 0.000000 | 0.050014 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 |
| 25% | 0.000000 | 0.476527 | 0.535543 | 0.527277 | 0.600445 | 0.600434 | 0.998969 | 0.797386 | 0.809312 | 0.303466 | 0.781567 | 0.000157 | 0.000128 | 0.461558 | 0.000203 | 0.000000 | 0.173613 | 0.173613 | 0.173676 | 0.214711 | 0.317748 | 0.015631 | 0.096083 | 0.170370 | 0.022065 | 0.847984 | 0.689270 | 0.689270 | 0.217580 | 4860000000.000000 | 0.000441 | 0.263759 | 0.374749 | 0.007555 | 0.004726 | 0.630612 | 0.003007 | 0.072891 | 0.851196 | 0.005244 | 0.370168 | 0.005366 | 0.096105 | 0.169376 | 0.397403 | 0.076462 | 0.000710 | 0.004387 | 0.000173 | 0.000233 | 0.021774 | 0.010433 | 0.392438 | 0.004121 | 0.774309 | 0.241973 | 0.352845 | 0.033543 | 0.005240 | 0.001973 | 0.053301 | 0.341023 | 0.277034 | 0.003163 | 0.626981 | 0.733612 | 0.328096 | 0.000000 | 0.931097 | 0.002236 | 0.014567 | 0.000146 | 0.000142 | 0.593934 | 0.000274 | 0.671565 | 0.085360 | 0.626981 | 0.328096 | 0.110933 | 0.633265 | 0.457116 | 0.565987 | 0.312995 | 0.018034 | 0.000000 | 0.796750 | 0.000904 | 0.623636 | 0.600443 | 0.840115 | 0.276944 | 0.026791 | 0.565158 | 1.000000 | 0.024477 |
| 50% | 0.000000 | 0.502706 | 0.559802 | 0.552278 | 0.605997 | 0.605976 | 0.999022 | 0.797464 | 0.809375 | 0.303525 | 0.781635 | 0.000278 | 509000000.000000 | 0.465080 | 0.000321 | 0.073489 | 0.184400 | 0.184400 | 0.184400 | 0.224544 | 0.322487 | 0.027376 | 0.104226 | 0.179709 | 0.022102 | 0.848044 | 0.689439 | 0.689439 | 0.217598 | 6400000000.000000 | 0.000462 | 0.264050 | 0.380425 | 0.010587 | 0.007412 | 0.630698 | 0.005546 | 0.111407 | 0.888593 | 0.005665 | 0.372624 | 0.005366 | 0.104133 | 0.178456 | 0.400131 | 0.118441 | 0.000968 | 0.006573 | 0.000765 | 0.000593 | 0.029516 | 0.018616 | 0.395898 | 0.007844 | 0.810275 | 0.386451 | 0.514830 | 0.074887 | 0.007909 | 0.004904 | 0.082705 | 0.348597 | 0.277178 | 0.006497 | 0.806881 | 0.736013 | 0.329685 | 0.001975 | 0.937672 | 0.002336 | 0.022674 | 0.000199 | 0.000225 | 0.593963 | 1080000000.000000 | 0.671574 | 0.196881 | 0.806881 | 0.329685 | 0.112340 | 0.645366 | 0.459750 | 0.593266 | 0.314953 | 0.027597 | 0.000000 | 0.810619 | 0.002085 | 0.623879 | 0.605998 | 0.841179 | 0.278778 | 0.026808 | 0.565252 | 1.000000 | 0.033798 |
| 75% | 0.000000 | 0.535563 | 0.589157 | 0.584105 | 0.613914 | 0.613842 | 0.999095 | 0.797579 | 0.809469 | 0.303585 | 0.781735 | 4145000000.000000 | 3450000000.000000 | 0.471004 | 0.000533 | 0.205841 | 0.199570 | 0.199570 | 0.199612 | 0.238820 | 0.328623 | 0.046357 | 0.116155 | 0.193493 | 0.022153 | 0.848123 | 0.689647 | 0.689647 | 0.217622 | 7390000000.000000 | 0.000499 | 0.264388 | 0.386731 | 0.016270 | 0.012249 | 0.631125 | 0.009273 | 0.148804 | 0.927109 | 0.006847 | 0.376271 | 0.005764 | 0.115927 | 0.191607 | 0.404551 | 0.176912 | 0.001455 | 0.008973 | 4620000000.000000 | 0.003652 | 0.042903 | 0.035855 | 0.401851 | 0.015020 | 0.850383 | 0.540594 | 0.689051 | 0.161073 | 0.012951 | 0.012806 | 0.119523 | 0.360915 | 0.277429 | 0.011147 | 0.942027 | 0.738560 | 0.332322 | 0.009006 | 0.944811 | 0.002492 | 0.035930 | 0.000453 | 4900000000.000000 | 0.594002 | 4510000000.000000 | 0.671587 | 0.372200 | 0.942027 | 0.332322 | 0.117106 | 0.663062 | 0.464236 | 0.624769 | 0.317707 | 0.038375 | 0.000000 | 0.826455 | 0.005270 | 0.624168 | 0.613913 | 0.842357 | 0.281449 | 0.026913 | 0.565725 | 1.000000 | 0.052838 |
| max | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 9990000000.000000 | 9980000000.000000 | 1.000000 | 990000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 3020000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 9990000000.000000 | 9330000000.000000 | 1.000000 | 1.000000 | 2750000000.000000 | 9230000000.000000 | 1.000000 | 9940000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 9740000000.000000 | 9730000000.000000 | 9990000000.000000 | 9990000000.000000 | 1.000000 | 8810000000.000000 | 1.000000 | 9570000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 8820000000.000000 | 9650000000.000000 | 1.000000 | 1.000000 | 1.000000 | 9910000000.000000 | 1.000000 | 1.000000 | 1.000000 | 9540000000.000000 | 1.000000 | 1.000000 | 1.000000 | 10000000000.000000 | 10000000000.000000 | 1.000000 | 10000000000.000000 | 1.000000 | 8320000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 9820000000.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
The output above shows the summary statistics of all the numeric variables: the mean, median (50%), minimum and maximum values, and the standard deviation.
The count is the same (6819) for every variable and matches the number of rows, so there are no missing values in these variables. Also, the standard deviation of Net Income Flag is 0 (its minimum and maximum are both 1), which means the feature is constant and carries no information for further analysis.
# removing Net Income Flag because it has std of 0 and is insignificant to further analysis
df1 = df1.drop(' Net Income Flag',axis = 1)
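Constant columns can also be found programmatically instead of by reading the summary table. A minimal sketch, assuming a small made-up frame in place of `df1`; the column `some_ratio` and all values are hypothetical, and `' Net Income Flag'` keeps its leading space as in the drop call above:

```python
import pandas as pd

# hypothetical stand-in for df1; the values are illustrative only
toy = pd.DataFrame({
    "Bankrupt?": [1, 0, 0, 0],
    " Net Income Flag": [1, 1, 1, 1],   # constant column
    "some_ratio": [0.37, 0.46, 0.43, 0.40],
})

# a column with a single distinct value has zero variance
constant_cols = [col for col in toy.columns if toy[col].nunique() == 1]
print(constant_cols)

# drop every constant column in one call
reduced = toy.drop(columns=constant_cols)
print(reduced.shape)
```

This finds the same kind of column as the zero standard deviation check, and it would also catch constant non-numeric columns if the data had any.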
Let's analyse the target variable "Bankrupt?".
df1['Bankrupt?'].value_counts()
0    6599
1     220
Name: Bankrupt?, dtype: int64
df1['Bankrupt?'].value_counts(normalize = True) * 100
0    96.773720
1     3.226280
Name: Bankrupt?, dtype: float64
plt.pie(df['Bankrupt?'].value_counts(),colors=['#5696b8','#355d73'],autopct='%.2f%%',labels = ['Healthy','Bankrupt'],
radius = 1.25)
plt.title('Number of Healthy Vs Bankrupt firms',fontsize=20)
plt.show()
The dataset is heavily imbalanced: about 96.8% of the firms are healthy and only about 3.2% are bankrupt.
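One practical consequence of this imbalance is that plain accuracy can be misleading. The short calculation below uses the class counts printed above and a hypothetical baseline that labels every firm as healthy. It is only an illustration of the arithmetic, not a model from this notebook.

```python
# class counts taken from the value_counts() output above
healthy, bankrupt = 6599, 220
total = healthy + bankrupt

# hypothetical baseline: predict "healthy" for every firm
baseline_accuracy = healthy / total   # share of firms labelled correctly
baseline_recall = 0 / bankrupt        # share of bankrupt firms detected
imbalance_ratio = healthy / bankrupt  # healthy firms per bankrupt firm

print(f"baseline accuracy: {baseline_accuracy:.4f}")
print(f"baseline recall on bankrupt firms: {baseline_recall:.4f}")
print(f"healthy firms per bankrupt firm: {imbalance_ratio:.1f}")
```

Such a baseline is right for almost 97% of firms while detecting no bankruptcies at all, which is why metrics such as recall or ROC AUC on the bankrupt class are more informative here.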
Let's analyse the distribution of each variable in the dataset using kurtosis and skewness. Then we will plot the distribution of every variable.
# identifying strongly leptokurtic features (pandas kurt() returns excess kurtosis, so a normal distribution scores 0)
kurt = df1.kurt()
len(kurt[kurt > 2])
85
# identifying highly skewed features
skew = df1.skew()
len(skew[abs(skew) >= 1])
82
# plotting distribution plots
fig, ax = plt.subplots(nrows = 10, ncols= 10, figsize=(50,45))
for variable, subplot in zip(df1.columns[1:], ax.flatten()):
    # histplot with a KDE overlay is used in place of the deprecated sns.distplot
    sns.histplot(df1[variable], kde=True, stat='density', ax=subplot)
plt.tight_layout(pad=4.0)
plt.show()
As the kurtosis and skewness analysis also showed, most features are skewed and leptokurtic, so the median is a better measure of central tendency than the mean.
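To see why the median is preferred for such features, consider a small made-up sample with one extreme value, similar in spirit to the ratio columns whose maximum is in the billions. The numbers below are hypothetical and chosen only for illustration.

```python
import pandas as pd

# hypothetical sample: mostly small values plus one extreme observation
sample = pd.Series([1, 1, 2, 2, 2, 3, 3, 4, 250])

print("mean:    ", sample.mean())    # pulled far upwards by the extreme value
print("median:  ", sample.median())  # stays with the bulk of the data
print("skewness:", sample.skew())    # strongly positive (long right tail)
print("kurtosis:", sample.kurt())    # excess kurtosis, well above 0 (heavy tail)
```

The mean lands far from where most observations sit, while the median does not move, which is the behaviour the statement above relies on.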
Let's analyse outliers for each variable in the dataset using boxplots.
# plotting boxplots for all variables using seaborn
plt.figure(figsize = (25,20))
ax =sns.boxplot(data = df1, orient="h",palette="Blues_d")
ax.set_title('Outlier Analysis using Boxplots', fontsize = 20)
ax.set(xscale="log")
plt.show()
There are a lot of outliers in most of the features; removing them might improve the performance of our model(s).
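Boxplots flag points that lie beyond the whiskers, which by default sit 1.5 interquartile ranges (IQR) outside the quartiles. The same rule can be applied numerically to count outliers per feature. This is a minimal sketch: the helper name `iqr_outlier_mask` and the sample values are assumptions made for illustration, and it is not necessarily the treatment used later in this notebook.

```python
import pandas as pd

def iqr_outlier_mask(series, k=1.5):
    """Return a boolean mask that is True where a value lies outside
    [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

# hypothetical feature: values near 0.5 plus one extreme observation
values = pd.Series([0.48, 0.50, 0.51, 0.52, 0.53, 0.55, 9.9])

mask = iqr_outlier_mask(values)
print(values[mask])     # the flagged observations
print(int(mask.sum()))  # number of outliers under the 1.5 * IQR rule
```

Applying the helper column by column (for example with `df1.apply(iqr_outlier_mask).sum()`) would give an outlier count per feature, at the cost of treating every feature with the same rule.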
# checking for columns with null values
[i for i in df1 if df1[i].isnull().sum() > 0]
[]
#checking for duplicated values
df.duplicated().sum()
0
There are no null or duplicate values in the dataset.
Correlation is a statistic that measures the degree to which two variables move in relation to each other.
To compute and inspect the correlation matrix, we perform the following steps:
1. Call the corr() function, which returns the correlation matrix of the numeric variables
2. Pass the correlation matrix to the heatmap() function of the seaborn library to plot it as a heatmap
3. Extract the list of highly correlated pairs (corr > 0.7)
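The third step can be sketched on a toy frame as follows. The helper `high_corr_pairs`, the column names and the values are all hypothetical. The sketch also compares the absolute correlation with the threshold, which is an assumption so that strong negative correlations are caught as well.

```python
import numpy as np
import pandas as pd

def high_corr_pairs(frame, threshold=0.7):
    """Return feature pairs whose absolute correlation exceeds the threshold."""
    corr = frame.corr().abs()
    # keep only the strict upper triangle so that each pair appears once
    # and the diagonal (always 1.0) is excluded
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack()
    return pairs[pairs > threshold].sort_values(ascending=False)

# hypothetical data: b is roughly 2 * a, c is only weakly related to both
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "c": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
})

pairs = high_corr_pairs(toy)
print(pairs)
```

Only the pair (a, b) is reported here. On the real data the same helper could be called on the feature correlation matrix input, for example `high_corr_pairs(df1.drop(['Bankrupt?'], axis=1))`.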
# calling corr() to get the correlation matrix of the features (target excluded)
corr = df1.drop(['Bankrupt?'],axis = 1).corr()
corr
| | ROA(C) before interest and depreciation before interest | ROA(A) before interest and % after tax | ROA(B) before interest and depreciation after tax | Operating Gross Margin | Realized Sales Gross Margin | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Continuous interest rate (after tax) | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Net Value Per Share (A) | Net Value Per Share (C) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Operating Profit Per Share (Yuan ¥) | Per Share Net profit before tax (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Debt ratio % | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Total Asset Turnover | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Current Liability to Liability | Current Liability to Equity | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROA(C) before interest and depreciation before interest | 1.000000 | 0.940124 | 0.986849 | 0.334719 | 0.332755 | 0.035725 | 0.053419 | 0.049222 | 0.020501 | 0.051328 | 0.066869 | 0.106461 | 0.323482 | 0.048882 | 0.250761 | 0.505580 | 0.505407 | 0.505281 | 0.775006 | 0.379839 | -0.015932 | 0.687201 | 0.750564 | 0.000591 | 0.036511 | 0.115083 | 0.115040 | 0.025234 | 0.019635 | -0.021930 | 0.079906 | 0.296158 | 0.013196 | -0.026336 | 0.003988 | -0.022208 | -0.261427 | 0.261427 | 0.002967 | -0.161671 | -0.035729 | 0.685028 | 0.753339 | -0.109888 | 0.210622 | -0.033947 | 0.007019 | -0.062660 | -0.065919 | 0.022896 | -0.014834 | 0.301996 | -0.012543 | 0.259680 | 0.181993 | 0.098820 | 0.235314 | -0.010530 | -0.046009 | -0.210256 | 0.388151 | -0.004447 | 0.013330 | 0.052783 | 0.103819 | -0.142734 | 0.021508 | 0.650217 | 0.023450 | -0.296019 | 0.005716 | -0.027280 | 0.001824 | -0.029477 | 0.011759 | -0.009192 | 0.052783 | -0.142734 | -0.086535 | 0.262454 | 0.159699 | 0.504311 | 0.129002 | -0.160725 | 0.887670 | -0.071725 | 0.008135 | 0.334721 | 0.274287 | -0.143629 | -0.016575 | 0.010573 | 0.052416 |
| ROA(A) before interest and % after tax | 0.940124 | 1.000000 | 0.955741 | 0.326969 | 0.324956 | 0.032053 | 0.053518 | 0.049474 | 0.029649 | 0.049909 | 0.075727 | 0.084334 | 0.288440 | 0.050362 | 0.225897 | 0.531799 | 0.531790 | 0.531821 | 0.764828 | 0.326239 | -0.011829 | 0.654253 | 0.752578 | 0.003277 | 0.042208 | 0.125384 | 0.125872 | 0.024887 | 0.026977 | -0.063970 | 0.081982 | 0.263615 | 0.014102 | -0.018412 | 0.005440 | -0.010323 | -0.259972 | 0.259972 | 0.020707 | -0.161868 | -0.036183 | 0.651581 | 0.758234 | -0.078585 | 0.223528 | -0.031262 | 0.009041 | -0.054496 | -0.136964 | 0.036925 | -0.014888 | 0.324942 | -0.006035 | 0.303532 | 0.202017 | 0.157005 | 0.217918 | -0.009612 | -0.037468 | -0.190501 | 0.351107 | -0.000004 | 0.004864 | 0.080401 | 0.120403 | -0.133816 | 0.022241 | 0.718013 | 0.028873 | -0.357147 | -0.000869 | -0.025143 | 0.004491 | -0.025817 | 0.012198 | -0.005860 | 0.080401 | -0.133816 | -0.103015 | 0.263591 | 0.157065 | 0.443017 | 0.112929 | -0.195673 | 0.961552 | -0.098900 | 0.011463 | 0.326971 | 0.291744 | -0.141039 | -0.011515 | 0.013372 | 0.057887 |
| ROA(B) before interest and depreciation after tax | 0.986849 | 0.955741 | 1.000000 | 0.333749 | 0.331755 | 0.035212 | 0.053726 | 0.049952 | 0.022366 | 0.052261 | 0.065602 | 0.102147 | 0.323040 | 0.045839 | 0.197344 | 0.502052 | 0.502000 | 0.501907 | 0.764597 | 0.366216 | -0.014359 | 0.659834 | 0.722940 | 0.002142 | 0.036144 | 0.117130 | 0.117042 | 0.024414 | 0.022104 | -0.026127 | 0.079972 | 0.292008 | 0.012975 | -0.024232 | 0.005187 | -0.021161 | -0.264734 | 0.264734 | 0.003869 | -0.158618 | -0.034177 | 0.657274 | 0.726003 | -0.109501 | 0.194810 | -0.033768 | 0.009921 | -0.053605 | -0.061046 | 0.012763 | -0.014545 | 0.304522 | -0.012770 | 0.260151 | 0.166311 | 0.094083 | 0.227144 | -0.010014 | -0.041296 | -0.217186 | 0.387893 | -0.001616 | 0.007302 | 0.046694 | 0.101962 | -0.142879 | 0.018300 | 0.673738 | 0.024436 | -0.322223 | -0.002611 | -0.029928 | 0.002488 | -0.030410 | 0.011977 | -0.008364 | 0.046694 | -0.142879 | -0.083190 | 0.258428 | 0.157022 | 0.497042 | 0.123622 | -0.162572 | 0.912040 | -0.089088 | 0.007523 | 0.333750 | 0.280617 | -0.142838 | -0.014663 | 0.011473 | 0.056430 |
| Operating Gross Margin | 0.334719 | 0.326969 | 0.333749 | 1.000000 | 0.999518 | 0.005745 | 0.032493 | 0.027175 | 0.051438 | 0.029430 | -0.206353 | -0.016976 | 0.341188 | 0.017198 | 0.067970 | 0.144661 | 0.145031 | 0.145057 | 0.256722 | 0.163192 | 0.117045 | 0.267944 | 0.247789 | 0.014172 | 0.022867 | 0.054639 | 0.053430 | 0.009121 | 0.016013 | -0.017448 | 0.026545 | 0.122676 | 0.024945 | 0.001379 | -0.002366 | -0.022360 | -0.245460 | 0.245460 | 0.006020 | -0.085733 | -0.022258 | 0.267411 | 0.248104 | -0.086720 | -0.099661 | 0.082342 | 0.022530 | 0.047665 | 0.001239 | -0.136157 | 0.019022 | 0.224976 | -0.006953 | 0.246304 | 0.152850 | 0.094782 | 0.241946 | -0.003206 | -0.030901 | -0.198027 | 0.246834 | -0.035025 | 0.035218 | 0.063547 | 0.067970 | -0.080422 | 0.000522 | 0.164579 | 0.043608 | 0.225479 | -0.121275 | -0.129715 | 0.020451 | -0.071579 | -0.041559 | 0.003507 | 0.063547 | -0.080422 | -0.068810 | 0.098097 | 0.114138 | 0.226990 | 0.030672 | -0.132650 | 0.300143 | 0.022672 | 0.004205 | 1.000000 | 0.075304 | -0.085434 | -0.011806 | -0.001167 | 0.120029 |
| Realized Sales Gross Margin | 0.332755 | 0.324956 | 0.331755 | 0.999518 | 1.000000 | 0.005610 | 0.032232 | 0.026851 | 0.051242 | 0.029166 | -0.206439 | -0.017391 | 0.341433 | 0.017121 | 0.067708 | 0.142887 | 0.143262 | 0.143288 | 0.254753 | 0.163163 | 0.117196 | 0.267021 | 0.246004 | 0.014188 | 0.022778 | 0.054470 | 0.053259 | 0.009117 | 0.016583 | -0.017451 | 0.026463 | 0.122787 | 0.024984 | 0.001418 | -0.002509 | -0.022354 | -0.245606 | 0.245606 | 0.006189 | -0.085598 | -0.022239 | 0.266483 | 0.246256 | -0.086762 | -0.100141 | 0.082479 | 0.022596 | 0.047079 | 0.001289 | -0.136335 | 0.019060 | 0.224458 | -0.007262 | 0.246221 | 0.152805 | 0.094838 | 0.242353 | -0.003186 | -0.031045 | -0.197842 | 0.246781 | -0.035085 | 0.035478 | 0.064657 | 0.067921 | -0.080350 | 0.000178 | 0.163013 | 0.043610 | 0.226170 | -0.121320 | -0.129747 | 0.020536 | -0.071321 | -0.041604 | 0.003524 | 0.064657 | -0.080350 | -0.068763 | 0.098056 | 0.114060 | 0.226912 | 0.030676 | -0.132607 | 0.298155 | 0.022750 | 0.004038 | 0.999518 | 0.074891 | -0.085407 | -0.011268 | -0.001158 | 0.120196 |
| Operating Profit Rate | 0.035725 | 0.032053 | 0.035212 | 0.005745 | 0.005610 | 1.000000 | 0.916448 | 0.862191 | -0.592006 | 0.915544 | 0.013246 | 0.016387 | 0.023051 | 0.002784 | 0.019936 | 0.019257 | 0.019218 | 0.019240 | 0.020420 | 0.014244 | -0.044460 | 0.022397 | 0.020219 | 0.000831 | 0.004952 | 0.011328 | 0.011227 | 0.001318 | 0.034465 | -0.000207 | 0.003677 | 0.014955 | 0.000833 | 0.000323 | 0.001156 | -0.001507 | 0.010397 | -0.010397 | -0.000833 | 0.001092 | 0.000247 | 0.022160 | 0.020015 | 0.011993 | 0.029456 | -0.023171 | 0.001001 | 0.009576 | 0.005232 | 0.016064 | -0.027450 | 0.018248 | 0.000726 | 0.025599 | 0.026100 | 0.033821 | 0.010465 | 0.000365 | 0.000301 | 0.011340 | 0.020308 | -0.001026 | -0.001748 | 0.020520 | 0.010600 | 0.001860 | 0.001967 | 0.021280 | 0.002047 | 0.005401 | 0.008117 | 0.012696 | -0.229568 | 0.016485 | -0.084747 | 0.000106 | 0.020520 | 0.001860 | -0.000654 | 0.020918 | 0.004669 | 0.026682 | 0.014088 | -0.079679 | 0.028482 | -0.003338 | 0.000199 | 0.005746 | 0.006216 | 0.001541 | 0.000935 | 0.000393 | -0.017071 |
| Pre-tax net Interest Rate | 0.053419 | 0.053518 | 0.053726 | 0.032493 | 0.032232 | 0.916448 | 1.000000 | 0.986379 | -0.220045 | 0.993617 | 0.014247 | 0.016836 | 0.024950 | 0.004031 | 0.023003 | 0.033034 | 0.033015 | 0.033035 | 0.033726 | 0.017617 | 0.004931 | 0.026314 | 0.034046 | 0.001246 | 0.003909 | 0.035150 | 0.034914 | 0.003013 | 0.037633 | -0.000998 | 0.005004 | 0.017801 | 0.003164 | -0.017376 | 0.001630 | -0.001964 | -0.003906 | 0.003906 | -0.001907 | -0.004654 | 0.002222 | 0.026091 | 0.033900 | 0.009042 | 0.029667 | 0.089576 | 0.002198 | 0.005560 | 0.003581 | 0.015513 | -0.144956 | 0.020271 | 0.000685 | 0.036347 | 0.030819 | 0.037156 | 0.017136 | 0.000296 | -0.001404 | 0.001632 | 0.022855 | 0.010231 | -0.017489 | 0.019009 | 0.014933 | -0.002202 | 0.002062 | 0.036236 | 0.003322 | -0.004525 | 0.008065 | 0.012206 | 0.090689 | 0.015581 | 0.233675 | -0.000047 | 0.019009 | -0.002202 | -0.007929 | 0.041845 | 0.011517 | 0.031813 | 0.026245 | -0.138584 | 0.048587 | -0.004243 | -0.000075 | 0.032494 | 0.011343 | -0.004043 | 0.000855 | 0.000984 | -0.014559 |
| After-tax net Interest Rate | 0.049222 | 0.049474 | 0.049952 | 0.027175 | 0.026851 | 0.862191 | 0.986379 | 1.000000 | -0.115211 | 0.984452 | 0.013982 | 0.016521 | 0.022813 | 0.003824 | 0.021164 | 0.031369 | 0.031347 | 0.031367 | 0.030768 | 0.016140 | 0.005594 | 0.024137 | 0.030621 | 0.001226 | 0.002962 | 0.031223 | 0.030964 | 0.002565 | 0.037066 | -0.000858 | 0.004379 | 0.016692 | 0.002685 | -0.015369 | 0.001612 | -0.001379 | -0.006174 | 0.006174 | -0.001227 | -0.004395 | 0.002117 | 0.023942 | 0.030568 | 0.008993 | 0.029504 | 0.079920 | 0.002032 | 0.005957 | 0.004241 | 0.015736 | -0.146671 | 0.018334 | 0.000702 | 0.040405 | 0.031453 | 0.037836 | 0.017118 | 0.000343 | -0.000827 | -0.002805 | 0.020877 | 0.009501 | -0.015766 | 0.013732 | 0.016788 | -0.003196 | 0.002061 | 0.034573 | 0.002985 | -0.002803 | 0.008174 | 0.012368 | 0.244911 | 0.015792 | 0.379952 | -0.000003 | 0.013732 | -0.003196 | -0.006326 | 0.046314 | 0.012243 | 0.029454 | 0.030022 | -0.166453 | 0.045390 | -0.003786 | -0.001091 | 0.027176 | 0.010648 | -0.004390 | 0.000927 | 0.000957 | -0.010900 |
| Non-industry income and expenditure/revenue | 0.020501 | 0.029649 | 0.022366 | 0.051438 | 0.051242 | -0.592006 | -0.220045 | -0.115211 | 1.000000 | -0.230698 | -0.003597 | -0.006041 | -0.005943 | 0.001332 | -0.002270 | 0.019588 | 0.019644 | 0.019632 | 0.018148 | 0.000758 | 0.118316 | -0.001601 | 0.019279 | 0.000484 | -0.004200 | 0.043179 | 0.042951 | 0.002855 | -0.008224 | -0.001505 | 0.001115 | -0.000606 | 0.004344 | -0.035784 | 0.000465 | -0.000284 | -0.033214 | 0.033214 | -0.001809 | -0.012037 | 0.003873 | -0.001473 | 0.019483 | -0.011027 | -0.012057 | 0.236897 | 0.001987 | -0.012145 | -0.005540 | -0.007916 | -0.225032 | -0.003658 | -0.000390 | 0.010800 | -0.001557 | -0.007613 | 0.009002 | -0.000294 | -0.003560 | -0.024357 | -0.003476 | 0.023107 | -0.030963 | -0.011739 | 0.004236 | -0.008969 | -0.000641 | 0.021105 | 0.001701 | -0.022281 | -0.003545 | -0.006365 | 0.742290 | -0.008805 | 0.677230 | -0.000355 | -0.011739 | -0.008969 | -0.014377 | 0.033285 | 0.011813 | -0.000973 | 0.018515 | -0.084875 | 0.028423 | -0.000408 | -0.000637 | 0.051437 | 0.007693 | -0.011899 | -0.000556 | 0.001024 | 0.012293 |
| Continuous interest rate (after tax) | 0.051328 | 0.049909 | 0.052261 | 0.029430 | 0.029166 | 0.915544 | 0.993617 | 0.984452 | -0.230698 | 1.000000 | 0.013168 | 0.015728 | 0.027730 | 0.003654 | 0.020407 | 0.030839 | 0.030835 | 0.030840 | 0.032051 | 0.016343 | 0.051607 | 0.024516 | 0.030487 | 0.001207 | 0.002643 | 0.016584 | 0.016415 | 0.001842 | 0.034962 | -0.000354 | 0.004203 | 0.017400 | 0.002887 | -0.016857 | 0.001479 | 0.000075 | -0.001192 | 0.001192 | -0.000187 | -0.002887 | 0.002166 | 0.024295 | 0.030334 | 0.008243 | 0.027711 | 0.100604 | 0.002168 | 0.006203 | 0.003439 | 0.014559 | -0.064693 | 0.018580 | 0.000652 | 0.035148 | 0.028547 | 0.034865 | 0.016445 | 0.000300 | -0.001112 | 0.000159 | 0.024520 | 0.011909 | -0.005399 | 0.014602 | 0.014826 | -0.002689 | 0.001882 | 0.034777 | 0.003094 | -0.004472 | 0.007518 | 0.011440 | 0.110703 | 0.014596 | 0.254886 | -0.000016 | 0.014602 | -0.002689 | -0.003093 | 0.042328 | 0.014059 | 0.030653 | 0.027140 | -0.140264 | 0.045600 | -0.004623 | -0.000556 | 0.029431 | 0.011191 | -0.002996 | 0.000774 | 0.000798 | -0.011299 |
| Operating Expense Rate | 0.066869 | 0.075727 | 0.065602 | -0.206353 | -0.206439 | 0.013246 | 0.014247 | 0.013982 | -0.003597 | 0.013168 | 1.000000 | -0.060386 | -0.020147 | -0.006011 | 0.060683 | 0.090519 | 0.091263 | 0.091197 | 0.080969 | 0.007253 | -0.015838 | 0.071799 | 0.081428 | -0.008170 | 0.013374 | 0.007176 | 0.009511 | -0.006644 | 0.014168 | -0.008456 | -0.003863 | -0.003016 | -0.007464 | 0.017687 | 0.024446 | -0.016164 | 0.143833 | -0.143833 | 0.008990 | 0.023977 | 0.010618 | 0.071972 | 0.082923 | 0.079747 | 0.195063 | -0.028331 | -0.007935 | -0.129214 | -0.055160 | 0.165135 | -0.010492 | 0.126869 | -0.009231 | -0.076724 | -0.004686 | 0.025720 | -0.110605 | -0.012904 | 0.024258 | 0.135256 | -0.042396 | -0.008018 | -0.011448 | -0.013653 | 0.007324 | 0.035899 | 0.001729 | 0.083315 | -0.001955 | -0.249426 | 0.170776 | 0.153936 | -0.003331 | 0.040730 | 0.003082 | -0.007464 | -0.013653 | 0.035899 | 0.024837 | 0.007630 | -0.006762 | -0.005426 | 0.014722 | 0.015511 | 0.071365 | -0.025524 | 0.006497 | -0.206354 | 0.029733 | 0.034809 | 0.013577 | 0.006232 | -0.120763 |
| Research and development expense rate | 0.106461 | 0.084334 | 0.102147 | -0.016976 | -0.017391 | 0.016387 | 0.016836 | 0.016521 | -0.006041 | 0.015728 | -0.060386 | 1.000000 | 0.030918 | 0.000656 | -0.019201 | 0.088822 | 0.087500 | 0.087063 | 0.076486 | 0.052162 | -0.019291 | 0.068738 | 0.066085 | -0.011151 | 0.012166 | 0.019958 | 0.020703 | 0.007842 | 0.023189 | -0.010300 | 0.029752 | 0.042269 | -0.009092 | -0.025702 | -0.013572 | -0.019292 | -0.045162 | 0.045162 | -0.047014 | -0.045529 | -0.012449 | 0.067143 | 0.069545 | -0.047222 | 0.013498 | -0.034508 | -0.028471 | 0.001366 | 0.016872 | -0.025440 | -0.012780 | -0.043509 | -0.028073 | 0.011881 | 0.064811 | -0.022853 | 0.013524 | -0.015717 | -0.042133 | -0.046075 | 0.035521 | -0.015522 | -0.023691 | -0.007120 | 0.005879 | -0.037452 | 0.008636 | 0.079153 | -0.008301 | -0.049438 | -0.046460 | -0.034643 | -0.002909 | 0.070369 | 0.003677 | -0.009092 | -0.007120 | -0.037452 | -0.012524 | 0.008293 | -0.008771 | 0.073629 | 0.008972 | -0.065204 | 0.079169 | -0.020166 | -0.006838 | -0.016975 | 0.021490 | -0.035363 | -0.013945 | -0.012160 | -0.045244 |
| Cash flow rate | 0.323482 | 0.288440 | 0.323040 | 0.341188 | 0.341433 | 0.023051 | 0.024950 | 0.022813 | -0.005943 | 0.027730 | -0.020147 | 0.030918 | 1.000000 | 0.011986 | 0.049835 | 0.158471 | 0.158520 | 0.158255 | 0.197705 | 0.353883 | 0.201679 | 0.191974 | 0.177008 | -0.017070 | 0.003731 | 0.019071 | 0.018300 | 0.003902 | 0.055389 | -0.007420 | -0.002779 | 0.344660 | 0.002257 | -0.012833 | 0.000513 | 0.349639 | -0.285010 | 0.285010 | 0.038308 | -0.078763 | -0.009771 | 0.191723 | 0.186619 | -0.125484 | -0.057978 | 0.005904 | 0.006638 | -0.020141 | 0.052978 | -0.096663 | -0.009018 | 0.131304 | -0.001554 | 0.161520 | 0.031135 | -0.049246 | 0.228027 | -0.008364 | -0.023394 | -0.278218 | 0.880562 | -0.006801 | 0.019862 | -0.068729 | 0.022609 | -0.087567 | 0.024446 | 0.223252 | 0.026204 | -0.081517 | -0.007870 | -0.057385 | 0.044478 | -0.093887 | 0.016086 | -0.006933 | -0.068729 | -0.087567 | -0.040178 | 0.224786 | 0.364812 | 0.603305 | 0.097761 | -0.126473 | 0.281309 | -0.052766 | 0.013642 | 0.341186 | 0.057933 | -0.080773 | -0.006348 | 0.001262 | 0.331710 |
| Interest-bearing debt interest rate | 0.048882 | 0.050362 | 0.045839 | 0.017198 | 0.017121 | 0.002784 | 0.004031 | 0.003824 | 0.001332 | 0.003654 | -0.006011 | 0.000656 | 0.011986 | 1.000000 | 0.010080 | 0.050347 | 0.050345 | 0.050159 | 0.059506 | 0.015297 | -0.003904 | 0.018363 | 0.061431 | -0.003299 | 0.001783 | 0.010106 | 0.010021 | 0.000499 | -0.018700 | 0.013614 | 0.004281 | 0.000198 | -0.001840 | -0.005201 | -0.004360 | -0.003984 | -0.059911 | 0.059911 | 0.002245 | -0.001729 | -0.002792 | 0.016198 | 0.054972 | 0.035486 | 0.008551 | -0.006983 | 0.016496 | 0.007713 | -0.036213 | 0.046759 | -0.002586 | -0.004588 | 0.024254 | 0.045141 | 0.025246 | 0.013544 | 0.014468 | 0.055143 | -0.011063 | -0.041390 | 0.015720 | -0.002820 | 0.011240 | 0.025009 | 0.068273 | -0.018880 | 0.006090 | 0.023919 | -0.000202 | -0.010339 | -0.009759 | -0.026821 | 0.000607 | -0.019243 | 0.000173 | -0.001840 | 0.025009 | -0.018880 | 0.038229 | -0.012599 | -0.011341 | 0.011482 | -0.006995 | 0.000022 | 0.048735 | -0.007519 | 0.003175 | 0.017198 | 0.010950 | -0.003423 | -0.007301 | -0.000779 | 0.028945 |
| Tax rate (A) | 0.250761 | 0.225897 | 0.197344 | 0.067970 | 0.067708 | 0.019936 | 0.023003 | 0.021164 | -0.002270 | 0.020407 | 0.060683 | -0.019201 | 0.049835 | 0.010080 | 1.000000 | 0.129891 | 0.130495 | 0.130665 | 0.169345 | 0.068363 | -0.021313 | 0.197467 | 0.208763 | 0.001507 | 0.018784 | 0.047621 | 0.047701 | 0.016857 | 0.067886 | -0.011380 | 0.009976 | 0.065329 | -0.000010 | -0.024875 | 0.034656 | -0.010497 | -0.009724 | 0.009724 | -0.012952 | -0.053177 | -0.012504 | 0.197419 | 0.209898 | 0.027022 | 0.193636 | -0.019968 | -0.000443 | -0.050769 | -0.135210 | 0.115270 | -0.014119 | 0.067826 | 0.005542 | 0.056103 | 0.119155 | 0.084386 | -0.011758 | -0.009854 | -0.035265 | 0.038520 | 0.051844 | -0.008833 | 0.017662 | 0.064182 | 0.041038 | -0.015970 | 0.016004 | 0.212204 | -0.000944 | -0.081476 | 0.063944 | 0.046586 | -0.002915 | 0.044051 | 0.004227 | -0.010045 | 0.064182 | -0.015970 | -0.054426 | 0.030302 | 0.023895 | 0.103101 | 0.021563 | -0.053579 | 0.231210 | -0.023643 | 0.011488 | 0.067971 | 0.077920 | -0.030002 | -0.014962 | 0.030275 | -0.053148 |
| Net Value Per Share (B) | 0.505580 | 0.531799 | 0.502052 | 0.144661 | 0.142887 | 0.019257 | 0.033034 | 0.031369 | 0.019588 | 0.030839 | 0.090519 | 0.088822 | 0.158471 | 0.050347 | 0.129891 | 1.000000 | 0.999342 | 0.999179 | 0.755568 | 0.346904 | -0.008235 | 0.607623 | 0.726321 | -0.013744 | 0.016049 | 0.056818 | 0.056518 | 0.043865 | -0.018871 | -0.030175 | 0.060968 | 0.090651 | 0.010462 | -0.002909 | -0.008175 | 0.008546 | -0.249146 | 0.249146 | 0.052254 | -0.123991 | -0.028406 | 0.603024 | 0.706646 | -0.089396 | 0.082026 | -0.018647 | -0.002448 | -0.080684 | -0.080971 | -0.032829 | -0.017799 | 0.261330 | -0.009322 | 0.198620 | 0.115516 | 0.047254 | 0.185621 | -0.014827 | -0.033078 | -0.198546 | 0.195965 | -0.005685 | -0.015542 | 0.044689 | 0.070112 | -0.102098 | 0.011322 | 0.491365 | 0.023143 | -0.235573 | -0.017123 | -0.043038 | 0.006438 | -0.054775 | 0.009424 | -0.003112 | 0.044689 | -0.102098 | -0.089004 | 0.142925 | 0.076980 | 0.230814 | 0.072982 | -0.164367 | 0.493776 | -0.059970 | 0.014303 | 0.144662 | 0.148693 | -0.110850 | -0.021860 | -0.002175 | 0.098434 |
| Net Value Per Share (A) | 0.505407 | 0.531790 | 0.502000 | 0.145031 | 0.143262 | 0.019218 | 0.033015 | 0.031347 | 0.019644 | 0.030835 | 0.091263 | 0.087500 | 0.158520 | 0.050345 | 0.130495 | 0.999342 | 1.000000 | 0.999837 | 0.755409 | 0.346511 | -0.008193 | 0.606954 | 0.725956 | -0.013689 | 0.016014 | 0.056679 | 0.056375 | 0.043767 | -0.018650 | -0.030089 | 0.060835 | 0.090377 | 0.010446 | -0.002874 | -0.008085 | 0.008547 | -0.249925 | 0.249925 | 0.052245 | -0.125321 | -0.028915 | 0.602367 | 0.705800 | -0.089992 | 0.082674 | -0.019675 | -0.002411 | -0.080248 | -0.081773 | -0.032440 | -0.017740 | 0.261381 | -0.009268 | 0.199598 | 0.115606 | 0.047816 | 0.185631 | -0.014773 | -0.032756 | -0.199086 | 0.195903 | -0.005589 | -0.015540 | 0.044624 | 0.070331 | -0.102539 | 0.011623 | 0.492760 | 0.023136 | -0.234878 | -0.016616 | -0.042926 | 0.006426 | -0.053797 | 0.009398 | -0.003094 | 0.044624 | -0.102539 | -0.090725 | 0.142930 | 0.076917 | 0.230629 | 0.073000 | -0.165083 | 0.493803 | -0.059780 | 0.014424 | 0.145032 | 0.148872 | -0.111797 | -0.021781 | -0.002358 | 0.098721 |
| Net Value Per Share (C) | 0.505281 | 0.531821 | 0.501907 | 0.145057 | 0.143288 | 0.019240 | 0.033035 | 0.031367 | 0.019632 | 0.030840 | 0.091197 | 0.087063 | 0.158255 | 0.050159 | 0.130665 | 0.999179 | 0.999837 | 1.000000 | 0.755217 | 0.346243 | -0.008222 | 0.606895 | 0.725825 | -0.013708 | 0.016022 | 0.056682 | 0.056378 | 0.043751 | -0.018389 | -0.030099 | 0.060802 | 0.090312 | 0.010429 | -0.002013 | -0.008056 | 0.008514 | -0.249463 | 0.249463 | 0.052137 | -0.125188 | -0.028923 | 0.602277 | 0.705621 | -0.089777 | 0.082434 | -0.019724 | -0.002455 | -0.080054 | -0.081749 | -0.032455 | -0.017757 | 0.261477 | -0.009310 | 0.199475 | 0.115310 | 0.047968 | 0.185300 | -0.014794 | -0.031013 | -0.198721 | 0.195621 | -0.005580 | -0.015649 | 0.044538 | 0.070383 | -0.102431 | 0.011509 | 0.492734 | 0.023117 | -0.235062 | -0.016690 | -0.043121 | 0.006431 | -0.053930 | 0.009403 | -0.003107 | 0.044538 | -0.102431 | -0.090667 | 0.143017 | 0.076871 | 0.230342 | 0.073080 | -0.165011 | 0.493822 | -0.059826 | 0.014335 | 0.145058 | 0.148906 | -0.111682 | -0.021674 | -0.002328 | 0.098390 |
| Persistent EPS in the Last Four Seasons | 0.775006 | 0.764828 | 0.764597 | 0.256722 | 0.254753 | 0.020420 | 0.033726 | 0.030768 | 0.018148 | 0.032051 | 0.080969 | 0.076486 | 0.197705 | 0.059506 | 0.169345 | 0.755568 | 0.755409 | 0.755217 | 1.000000 | 0.455794 | -0.009690 | 0.876769 | 0.955591 | -0.002178 | 0.025088 | 0.086080 | 0.086083 | 0.023707 | -0.036743 | -0.022101 | 0.080140 | 0.165224 | 0.011286 | -0.004244 | 0.000007 | -0.011383 | -0.177429 | 0.177429 | 0.052178 | -0.144138 | -0.053962 | 0.873641 | 0.959461 | -0.037986 | 0.214710 | -0.019997 | 0.007838 | -0.071460 | -0.129457 | 0.066033 | -0.011412 | 0.351589 | -0.009403 | 0.253188 | 0.215097 | 0.176931 | 0.240956 | -0.006744 | -0.034404 | -0.097689 | 0.248241 | 0.000283 | 0.031821 | 0.107310 | 0.121854 | -0.094966 | 0.019496 | 0.492078 | 0.023013 | -0.177996 | -0.000412 | -0.029352 | 0.002192 | -0.034256 | 0.007255 | -0.006477 | 0.107310 | -0.094966 | -0.114381 | 0.222378 | 0.123403 | 0.333636 | 0.129661 | -0.154690 | 0.691152 | -0.033509 | 0.003791 | 0.256723 | 0.222961 | -0.114114 | -0.018829 | 0.008039 | 0.036722 |
| Cash Flow Per Share | 0.379839 | 0.326239 | 0.366216 | 0.163192 | 0.163163 | 0.014244 | 0.017617 | 0.016140 | 0.000758 | 0.016343 | 0.007253 | 0.052162 | 0.353883 | 0.015297 | 0.068363 | 0.346904 | 0.346511 | 0.346243 | 0.455794 | 1.000000 | -0.006983 | 0.460740 | 0.439257 | -0.104689 | -0.005481 | 0.001412 | 0.000488 | -0.000375 | 0.054534 | -0.009383 | -0.187418 | 0.650279 | 0.000581 | -0.025135 | 0.003575 | 0.003779 | -0.158117 | 0.158117 | -0.030964 | -0.099198 | -0.009521 | 0.465102 | 0.445817 | -0.125508 | 0.052539 | -0.020618 | -0.010239 | -0.037357 | 0.013248 | 0.089648 | -0.008271 | 0.062192 | -0.010749 | 0.073991 | 0.098143 | -0.037778 | 0.253610 | -0.013814 | -0.022608 | -0.147717 | 0.415122 | -0.009112 | 0.028555 | 0.006296 | -0.009904 | -0.045022 | 0.031188 | 0.225557 | 0.011126 | -0.013919 | 0.042716 | -0.028332 | -0.000240 | -0.058218 | 0.004107 | -0.023987 | 0.006296 | -0.045022 | -0.037488 | 0.246791 | 0.129250 | 0.715003 | 0.199675 | -0.052804 | 0.292252 | -0.023591 | 0.002721 | 0.163190 | 0.074250 | -0.047298 | -0.006200 | 0.001358 | 0.052117 |
| Revenue Per Share (Yuan ¥) | -0.015932 | -0.011829 | -0.014359 | 0.117045 | 0.117196 | -0.044460 | 0.004931 | 0.005594 | 0.118316 | 0.051607 | -0.015838 | -0.019291 | 0.201679 | -0.003904 | -0.021313 | -0.008235 | -0.008193 | -0.008222 | -0.009690 | -0.006983 | 1.000000 | -0.014822 | -0.011663 | 0.000192 | -0.000427 | -0.058294 | -0.058071 | -0.003503 | -0.010670 | -0.000353 | -0.001965 | 0.000720 | -0.000311 | -0.000880 | -0.000710 | 0.029592 | -0.019679 | 0.019679 | 0.128149 | 0.001181 | 0.004573 | -0.014768 | -0.011463 | -0.009711 | -0.035980 | 0.264346 | -0.000985 | -0.011734 | -0.010462 | -0.020826 | 0.275742 | -0.025973 | -0.000982 | 0.020298 | -0.014145 | 0.000271 | 0.018125 | -0.000538 | -0.001871 | -0.026361 | 0.127459 | -0.000279 | 0.037494 | -0.027977 | 0.001783 | -0.006360 | -0.002441 | -0.004366 | -0.000558 | -0.017885 | -0.010894 | -0.016476 | 0.184916 | -0.021618 | 0.037165 | -0.000311 | -0.027977 | -0.006360 | 0.010574 | 0.015799 | 0.156966 | -0.002226 | 0.003660 | -0.006224 | -0.008315 | -0.001272 | 0.027256 | 0.117044 | -0.001104 | -0.002132 | -0.001140 | -0.000053 | 0.233203 |
| Operating Profit Per Share (Yuan ¥) | 0.687201 | 0.654253 | 0.659834 | 0.267944 | 0.267021 | 0.022397 | 0.026314 | 0.024137 | -0.001601 | 0.024516 | 0.071799 | 0.068738 | 0.191974 | 0.018363 | 0.197467 | 0.607623 | 0.606954 | 0.606895 | 0.876769 | 0.460740 | -0.014822 | 1.000000 | 0.861813 | -0.002527 | 0.027941 | 0.067502 | 0.067182 | 0.012131 | -0.041334 | -0.022639 | 0.076453 | 0.205792 | -0.001014 | 0.000720 | 0.000004 | -0.013880 | -0.078056 | 0.078056 | 0.043273 | -0.107135 | -0.056626 | 0.998696 | 0.886157 | 0.020064 | 0.282591 | -0.019500 | 0.002794 | -0.046790 | -0.119366 | 0.142435 | -0.009899 | 0.367024 | -0.008020 | 0.266177 | 0.266637 | 0.260501 | 0.236888 | -0.007192 | -0.023966 | -0.003496 | 0.245701 | -0.003723 | 0.037542 | 0.135996 | 0.126546 | -0.055461 | 0.022299 | 0.415226 | 0.015660 | -0.070742 | 0.008130 | -0.001263 | -0.000435 | -0.012514 | 0.005120 | -0.006732 | 0.135996 | -0.055461 | -0.094285 | 0.218511 | 0.118816 | 0.338206 | 0.126165 | -0.145741 | 0.577846 | -0.032299 | 0.001169 | 0.267946 | 0.183601 | -0.077102 | -0.015936 | 0.006331 | -0.009316 |
| Per Share Net profit before tax (Yuan ¥) | 0.750564 | 0.752578 | 0.722940 | 0.247789 | 0.246004 | 0.020219 | 0.034046 | 0.030621 | 0.019279 | 0.030487 | 0.081428 | 0.066085 | 0.177008 | 0.061431 | 0.208763 | 0.726321 | 0.725956 | 0.725825 | 0.955591 | 0.439257 | -0.011663 | 0.861813 | 1.000000 | -0.000496 | 0.029782 | 0.090223 | 0.090368 | 0.024448 | -0.055298 | -0.037629 | 0.120135 | 0.168160 | 0.009684 | -0.006931 | -0.001111 | -0.008518 | -0.158897 | 0.158897 | 0.047935 | -0.142138 | -0.047234 | 0.858310 | 0.962723 | -0.031613 | 0.230325 | -0.020045 | 0.004424 | -0.073633 | -0.133878 | 0.089929 | -0.014148 | 0.325783 | -0.008783 | 0.238435 | 0.219267 | 0.175784 | 0.222925 | -0.006873 | -0.034862 | -0.079795 | 0.222474 | -0.002642 | 0.033599 | 0.106568 | 0.114848 | -0.089174 | 0.047070 | 0.473736 | 0.022244 | -0.156954 | 0.009443 | -0.024310 | 0.001388 | -0.029827 | 0.006428 | -0.006482 | 0.106568 | -0.089174 | -0.110478 | 0.224392 | 0.123983 | 0.308200 | 0.105591 | -0.148721 | 0.671748 | -0.028837 | 0.008267 | 0.247791 | 0.218389 | -0.107727 | -0.017885 | 0.008143 | 0.028185 |
| Realized Sales Gross Profit Growth Rate | 0.000591 | 0.003277 | 0.002142 | 0.014172 | 0.014188 | 0.000831 | 0.001246 | 0.001226 | 0.000484 | 0.001207 | -0.008170 | -0.011151 | -0.017070 | -0.003299 | 0.001507 | -0.013744 | -0.013689 | -0.013708 | -0.002178 | -0.104689 | 0.000192 | -0.002527 | -0.000496 | 1.000000 | 0.002192 | 0.006470 | 0.006444 | 0.000747 | -0.035116 | -0.000698 | 0.005843 | -0.049794 | -0.000285 | 0.003220 | -0.000260 | -0.000589 | 0.011461 | -0.011461 | 0.001248 | -0.004898 | -0.001484 | -0.002517 | -0.002674 | 0.018133 | 0.096856 | -0.000630 | -0.000937 | -0.007853 | -0.005697 | 0.061691 | -0.000539 | 0.001962 | -0.001002 | 0.012096 | 0.015756 | 0.028146 | -0.005605 | -0.000494 | -0.001799 | 0.021559 | -0.023343 | -0.000121 | -0.001290 | 0.016519 | 0.008683 | 0.004456 | -0.001783 | -0.008902 | -0.000307 | 0.050982 | 0.019388 | 0.003755 | 0.000023 | -0.009126 | -0.000182 | -0.000301 | 0.016519 | 0.004456 | -0.004258 | -0.004178 | 0.002393 | -0.046844 | -0.005539 | -0.002813 | 0.003064 | 0.002692 | -0.000764 | 0.014172 | 0.001952 | 0.001687 | -0.000672 | -0.000327 | -0.002302 |
| Operating Profit Growth Rate | 0.036511 | 0.042208 | 0.036144 | 0.022867 | 0.022778 | 0.004952 | 0.003909 | 0.002962 | -0.004200 | 0.002643 | 0.013374 | 0.012166 | 0.003731 | 0.001783 | 0.018784 | 0.016049 | 0.016014 | 0.016022 | 0.025088 | -0.005481 | -0.000427 | 0.027941 | 0.029782 | 0.002192 | 1.000000 | 0.639394 | 0.636793 | 0.100821 | 0.015553 | -0.004534 | 0.326720 | -0.000856 | 0.000089 | 0.000404 | 0.006090 | 0.000051 | -0.018100 | 0.018100 | 0.001654 | 0.002009 | -0.002708 | 0.027680 | 0.027997 | 0.005847 | 0.044088 | -0.025684 | -0.001395 | 0.002095 | -0.007908 | 0.033000 | -0.000853 | 0.015480 | 0.000348 | 0.024150 | 0.012454 | 0.018826 | 0.008048 | 0.000145 | -0.002419 | -0.006721 | 0.003664 | 0.000081 | 0.001564 | 0.011871 | -0.003457 | 0.005973 | 0.001546 | 0.021450 | 0.000806 | -0.004173 | 0.011499 | 0.012411 | -0.005974 | 0.014003 | -0.005175 | 0.000042 | 0.011871 | 0.005973 | -0.012554 | 0.008680 | 0.003389 | 0.002719 | -0.020307 | -0.021073 | 0.041046 | 0.000063 | -0.000180 | 0.022866 | -0.007570 | 0.000537 | 0.001247 | 0.004576 | 0.001725 |
| After-tax Net Profit Growth Rate | 0.115083 | 0.125384 | 0.117130 | 0.054639 | 0.054470 | 0.011328 | 0.035150 | 0.031223 | 0.043179 | 0.016584 | 0.007176 | 0.019958 | 0.019071 | 0.010106 | 0.047621 | 0.056818 | 0.056679 | 0.056682 | 0.086080 | 0.001412 | -0.058294 | 0.067502 | 0.090223 | 0.006470 | 0.639394 | 1.000000 | 0.996186 | 0.113051 | 0.008039 | -0.003660 | 0.223919 | 0.020630 | 0.000201 | 0.000796 | 0.006938 | 0.000535 | -0.030240 | 0.030240 | 0.003240 | -0.012188 | -0.004558 | 0.066870 | 0.088074 | -0.005673 | 0.059724 | -0.019195 | 0.003642 | -0.013040 | -0.005390 | 0.035121 | -0.129766 | 0.046260 | 0.000548 | 0.031928 | 0.028478 | 0.015090 | 0.020819 | 0.000469 | -0.005475 | -0.021936 | 0.015927 | 0.001996 | -0.019321 | -0.000929 | 0.004411 | -0.007517 | 0.005114 | 0.064121 | 0.002600 | -0.031487 | 0.005805 | 0.014418 | 0.000371 | 0.014097 | 0.000916 | 0.000381 | -0.000929 | -0.007517 | -0.018494 | 0.027411 | 0.015736 | 0.014993 | -0.006925 | -0.033598 | 0.119596 | 0.001185 | 0.002108 | 0.054639 | 0.020203 | -0.011685 | 0.002030 | 0.005373 | 0.001253 |
| Regular Net Profit Growth Rate | 0.115040 | 0.125872 | 0.117042 | 0.053430 | 0.053259 | 0.011227 | 0.034914 | 0.030964 | 0.042951 | 0.016415 | 0.009511 | 0.020703 | 0.018300 | 0.010021 | 0.047701 | 0.056518 | 0.056375 | 0.056378 | 0.086083 | 0.000488 | -0.058071 | 0.067182 | 0.090368 | 0.006444 | 0.636793 | 0.996186 | 1.000000 | 0.112904 | 0.008911 | -0.003649 | 0.223106 | 0.019473 | 0.000186 | 0.000783 | 0.006874 | 0.000524 | -0.030512 | 0.030512 | 0.003390 | -0.012109 | -0.004274 | 0.066552 | 0.088255 | -0.005827 | 0.060223 | -0.019103 | -0.002623 | -0.013363 | -0.004697 | 0.035224 | -0.128905 | 0.046101 | 0.000535 | 0.031764 | 0.028959 | 0.014820 | 0.020039 | 0.000461 | -0.005428 | -0.022080 | 0.015185 | 0.001776 | -0.019272 | -0.000933 | 0.004243 | -0.007521 | 0.004905 | 0.064422 | 0.002599 | -0.032827 | 0.005591 | 0.014146 | 0.000198 | 0.013462 | 0.000756 | 0.000376 | -0.000933 | -0.007521 | -0.018514 | 0.026884 | 0.015741 | 0.013393 | -0.007324 | -0.033395 | 0.119870 | 0.001166 | 0.002026 | 0.053430 | 0.020273 | -0.011705 | 0.002014 | 0.005329 | 0.001383 |
| Continuous Net Profit Growth Rate | 0.025234 | 0.024887 | 0.024414 | 0.009121 | 0.009117 | 0.001318 | 0.003013 | 0.002565 | 0.002855 | 0.001842 | -0.006644 | 0.007842 | 0.003902 | 0.000499 | 0.016857 | 0.043865 | 0.043767 | 0.043751 | 0.023707 | -0.000375 | -0.003503 | 0.012131 | 0.024448 | 0.000747 | 0.100821 | 0.113051 | 0.112904 | 1.000000 | 0.010175 | -0.000270 | 0.036198 | -0.003750 | -0.000060 | -0.000120 | 0.001145 | -0.000154 | -0.025205 | 0.025205 | 0.000964 | -0.007715 | -0.001580 | 0.012002 | 0.025103 | -0.006369 | -0.000215 | -0.002989 | -0.000717 | -0.012954 | -0.003729 | -0.001916 | -0.007249 | 0.005900 | -0.000199 | 0.004795 | -0.003351 | -0.007937 | -0.001619 | -0.000081 | -0.001146 | -0.016883 | 0.003403 | 0.060758 | -0.001519 | 0.003392 | 0.001327 | -0.006179 | -0.000507 | 0.011965 | 0.000653 | -0.013009 | -0.002073 | -0.007177 | -0.000385 | -0.006509 | -0.000180 | -0.000043 | 0.003392 | -0.006179 | -0.008501 | -0.000208 | 0.000120 | -0.001262 | -0.000834 | -0.009174 | 0.024257 | -0.000307 | 0.002108 | 0.009122 | 0.006638 | -0.007433 | 0.000014 | 0.001086 | 0.014498 |
| Total Asset Growth Rate | 0.019635 | 0.026977 | 0.022104 | 0.016013 | 0.016583 | 0.034465 | 0.037633 | 0.037066 | -0.008224 | 0.034962 | 0.014168 | 0.023189 | 0.055389 | -0.018700 | 0.067886 | -0.018871 | -0.018650 | -0.018389 | -0.036743 | 0.054534 | -0.010670 | -0.041334 | -0.055298 | -0.035116 | 0.015553 | 0.008039 | 0.008911 | 0.010175 | 1.000000 | -0.008688 | -0.037136 | 0.063710 | 0.006779 | -0.013451 | 0.007325 | -0.012469 | -0.049191 | 0.049191 | -0.033926 | -0.019354 | -0.001136 | -0.038978 | -0.028557 | -0.045200 | -0.072393 | -0.030866 | -0.035608 | 0.030277 | 0.009155 | -0.084994 | -0.017423 | 0.003777 | -0.010750 | -0.025894 | -0.068393 | -0.077647 | -0.064079 | -0.002528 | 0.006808 | -0.069315 | 0.064132 | 0.010147 | -0.011164 | -0.044135 | -0.013873 | -0.036283 | 0.016918 | 0.099163 | -0.020221 | -0.079619 | 0.000739 | 0.032340 | 0.004511 | 0.064419 | 0.004524 | 0.006570 | -0.044135 | -0.036283 | -0.013855 | -0.104641 | -0.064552 | 0.100864 | -0.048385 | -0.027703 | 0.080031 | -0.038909 | -0.013174 | 0.016014 | 0.032565 | -0.033052 | 0.005520 | 0.001723 | -0.015962 |
| Net Value Growth Rate | -0.021930 | -0.063970 | -0.026127 | -0.017448 | -0.017451 | -0.000207 | -0.000998 | -0.000858 | -0.001505 | -0.000354 | -0.008456 | -0.010300 | -0.007420 | 0.013614 | -0.011380 | -0.030175 | -0.030089 | -0.030099 | -0.022101 | -0.009383 | -0.000353 | -0.022639 | -0.037629 | -0.000698 | -0.004534 | -0.003660 | -0.003649 | -0.000270 | -0.008688 | 1.000000 | -0.004653 | -0.013420 | -0.000166 | -0.000470 | -0.000650 | -0.000360 | 0.056508 | -0.056508 | -0.001914 | -0.077942 | -0.001584 | -0.022705 | -0.038361 | -0.012109 | -0.006532 | -0.000631 | -0.000526 | -0.009009 | 0.036004 | 0.063203 | -0.000234 | -0.012497 | -0.000524 | -0.044672 | -0.019193 | -0.025684 | -0.010481 | 0.075915 | -0.000999 | 0.024605 | -0.006967 | -0.000552 | -0.001316 | -0.016129 | 0.063047 | -0.055279 | -0.001303 | -0.047416 | -0.000709 | 0.042600 | 0.008879 | 0.006129 | -0.000367 | -0.002081 | 0.000039 | -0.000166 | -0.016129 | -0.055279 | -0.071401 | -0.008237 | -0.001937 | -0.018421 | 0.013669 | 0.117590 | -0.072408 | -0.000679 | -0.010080 | -0.017450 | 0.068054 | -0.068649 | -0.000697 | -0.000446 | -0.010685 |
| Total Asset Return Growth Rate Ratio | 0.079906 | 0.081982 | 0.079972 | 0.026545 | 0.026463 | 0.003677 | 0.005004 | 0.004379 | 0.001115 | 0.004203 | -0.003863 | 0.029752 | -0.002779 | 0.004281 | 0.009976 | 0.060968 | 0.060835 | 0.060802 | 0.080140 | -0.187418 | -0.001965 | 0.076453 | 0.120135 | 0.005843 | 0.326720 | 0.223919 | 0.223106 | 0.036198 | -0.037136 | -0.004653 | 1.000000 | -0.032078 | -0.000737 | -0.000135 | 0.001643 | -0.000794 | -0.001100 | 0.001100 | 0.002237 | -0.009720 | -0.004014 | 0.052078 | 0.062054 | 0.007472 | 0.041372 | -0.011291 | -0.001482 | 0.013148 | -0.011274 | 0.027419 | -0.003493 | 0.022437 | -0.001571 | 0.026427 | 0.024787 | 0.038872 | 0.013891 | 0.000358 | -0.002480 | 0.016977 | -0.003354 | 0.000492 | 0.001824 | 0.027073 | 0.015315 | -0.001407 | 0.001410 | 0.025056 | 0.000794 | -0.015412 | -0.005130 | -0.009262 | -0.001643 | 0.018438 | -0.000610 | -0.000206 | 0.027073 | -0.001407 | -0.011811 | 0.038569 | 0.019516 | -0.023001 | 0.022967 | -0.008920 | 0.062183 | 0.006583 | -0.000310 | 0.026544 | 0.019467 | -0.005198 | -0.000310 | 0.001498 | -0.002454 |
| Cash Reinvestment % | 0.296158 | 0.263615 | 0.292008 | 0.122676 | 0.122787 | 0.014955 | 0.017801 | 0.016692 | -0.000606 | 0.017400 | -0.003016 | 0.042269 | 0.344660 | 0.000198 | 0.065329 | 0.090651 | 0.090377 | 0.090312 | 0.165224 | 0.650279 | 0.000720 | 0.205792 | 0.168160 | -0.049794 | -0.000856 | 0.020630 | 0.019473 | -0.003750 | 0.063710 | -0.013420 | -0.032078 | 1.000000 | -0.004268 | -0.022360 | 0.011983 | 0.014103 | -0.134276 | 0.134276 | -0.047478 | -0.185446 | -0.006771 | 0.203603 | 0.158675 | -0.195899 | 0.055909 | -0.018598 | -0.012798 | -0.021508 | 0.015146 | 0.103647 | -0.003601 | -0.014826 | -0.005393 | 0.056922 | 0.071882 | -0.047558 | 0.167651 | -0.028415 | -0.016929 | -0.138271 | 0.412683 | -0.005482 | 0.020751 | -0.011324 | 0.036706 | -0.167594 | 0.023165 | 0.205944 | 0.004849 | -0.060663 | 0.037312 | -0.022286 | 0.003031 | -0.041483 | 0.006674 | -0.034956 | -0.011324 | -0.167594 | -0.018924 | 0.247814 | 0.136652 | 0.738276 | 0.089089 | -0.052971 | 0.252716 | -0.041889 | 0.008558 | 0.122675 | 0.162473 | -0.133686 | 0.003717 | 0.005525 | 0.045546 |
| Current Ratio | 0.013196 | 0.014102 | 0.012975 | 0.024945 | 0.024984 | 0.000833 | 0.003164 | 0.002685 | 0.004344 | 0.002887 | -0.007464 | -0.009092 | 0.002257 | -0.001840 | -0.000010 | 0.010462 | 0.010446 | 0.010429 | 0.011286 | 0.000581 | -0.000311 | -0.001014 | 0.009684 | -0.000285 | 0.000089 | 0.000201 | 0.000186 | -0.000060 | 0.006779 | -0.000166 | -0.000737 | -0.004268 | 1.000000 | -0.000415 | -0.000379 | -0.000318 | -0.015860 | 0.015860 | 0.000118 | -0.002542 | 0.004485 | -0.000993 | 0.010726 | -0.007975 | -0.015341 | -0.000557 | -0.000464 | -0.007943 | -0.004931 | -0.009229 | -0.000206 | 0.038839 | -0.000463 | -0.020116 | -0.023992 | -0.029002 | -0.010622 | -0.000254 | -0.000882 | -0.012141 | 0.003547 | -0.000486 | -0.001161 | 0.004915 | -0.005779 | -0.003582 | -0.001150 | 0.015432 | 0.003411 | -0.011762 | -0.002240 | -0.007291 | -0.000487 | -0.003635 | -0.000101 | -0.000147 | 0.004915 | -0.003582 | -0.002434 | -0.006635 | -0.005370 | 0.001195 | -0.002907 | 0.347630 | 0.014946 | -0.000599 | 0.008178 | 0.024946 | 0.002489 | -0.003741 | -0.000574 | -0.000150 | 0.010228 |
| Quick Ratio | -0.026336 | -0.018412 | -0.024232 | 0.001379 | 0.001418 | 0.000323 | -0.017376 | -0.015369 | -0.035784 | -0.016857 | 0.017687 | -0.025702 | -0.012833 | -0.005201 | -0.024875 | -0.002909 | -0.002874 | -0.002013 | -0.004244 | -0.025135 | -0.000880 | 0.000720 | -0.006931 | 0.003220 | 0.000404 | 0.000796 | 0.000783 | -0.000120 | -0.013451 | -0.000470 | -0.000135 | -0.022360 | -0.000415 | 1.000000 | 0.003143 | -0.000898 | 0.033410 | -0.033410 | 0.001013 | 0.016767 | -0.001224 | 0.000816 | -0.006861 | 0.037141 | -0.037770 | -0.001574 | 0.016026 | -0.019058 | 0.001797 | -0.020018 | -0.000583 | 0.074423 | -0.001309 | -0.009233 | -0.063064 | 0.027031 | -0.029790 | -0.000717 | 0.151987 | 0.048153 | -0.013926 | 0.008347 | -0.003283 | 0.028012 | 0.003585 | 0.014777 | -0.003252 | -0.011885 | -0.000314 | -0.018296 | -0.014514 | 0.007069 | 0.002905 | -0.022126 | 0.000025 | -0.000415 | 0.028012 | 0.014777 | -0.003287 | -0.008936 | -0.004089 | -0.023447 | -0.005717 | 0.006420 | -0.017779 | -0.001694 | -0.014929 | 0.001381 | -0.002374 | 0.009645 | -0.000083 | 0.001867 | -0.017084 |
| Interest Expense Ratio | 0.003988 | 0.005440 | 0.005187 | -0.002366 | -0.002509 | 0.001156 | 0.001630 | 0.001612 | 0.000465 | 0.001479 | 0.024446 | -0.013572 | 0.000513 | -0.004360 | 0.034656 | -0.008175 | -0.008085 | -0.008056 | 0.000007 | 0.003575 | -0.000710 | 0.000004 | -0.001111 | -0.000260 | 0.006090 | 0.006938 | 0.006874 | 0.001145 | 0.007325 | -0.000650 | 0.001643 | 0.011983 | -0.000379 | 0.003143 | 1.000000 | -0.000797 | 0.023196 | -0.023196 | -0.007224 | 0.007844 | -0.000074 | -0.000028 | -0.001346 | 0.005507 | -0.002619 | -0.001307 | 0.001691 | -0.011133 | 0.025936 | 0.006706 | -0.003175 | 0.008881 | -0.002950 | -0.014455 | -0.016619 | -0.009594 | -0.017431 | 0.001858 | 0.010633 | 0.006254 | -0.001680 | 0.004150 | -0.001318 | -0.022008 | -0.002860 | 0.004542 | -0.006971 | 0.000065 | -0.000430 | -0.013466 | 0.030494 | 0.007865 | -0.000240 | 0.002048 | -0.000038 | -0.013903 | -0.022008 | 0.004542 | 0.008543 | -0.011678 | -0.004043 | 0.008880 | -0.011266 | 0.014302 | 0.004969 | -0.000586 | -0.000076 | -0.002365 | 0.003604 | 0.006926 | 0.016829 | -0.034321 | -0.012626 |
| Total debt/Total net worth | -0.022208 | -0.010323 | -0.021161 | -0.022360 | -0.022354 | -0.001507 | -0.001964 | -0.001379 | -0.000284 | 0.000075 | -0.016164 | -0.019292 | 0.349639 | -0.003984 | -0.010497 | 0.008546 | 0.008547 | 0.008514 | -0.011383 | 0.003779 | 0.029592 | -0.013880 | -0.008518 | -0.000589 | 0.000051 | 0.000535 | 0.000524 | -0.000154 | -0.012469 | -0.000360 | -0.000794 | 0.014103 | -0.000318 | -0.000898 | -0.000797 | 1.000000 | -0.054049 | 0.054049 | 0.030256 | -0.008080 | -0.001269 | -0.013835 | -0.008575 | -0.008896 | -0.027847 | 0.008627 | -0.001005 | -0.013441 | 0.016062 | -0.018407 | -0.000446 | -0.025393 | -0.001002 | 0.051384 | -0.002856 | 0.016898 | 0.019683 | -0.000549 | -0.001910 | -0.045142 | 0.265843 | -0.000574 | -0.002515 | -0.006780 | 0.008081 | -0.009967 | -0.002491 | -0.009129 | -0.000725 | -0.015686 | -0.011118 | -0.016815 | 0.015300 | -0.016561 | 0.004189 | -0.000318 | -0.006780 | -0.009967 | -0.006328 | 0.022285 | 0.149285 | 0.023123 | 0.006953 | -0.025966 | -0.008056 | -0.001298 | 0.002556 | -0.022363 | -0.000700 | -0.010045 | -0.001262 | -0.000431 | 0.338898 |
| Debt ratio % | -0.261427 | -0.259972 | -0.264734 | -0.245460 | -0.245606 | 0.010397 | -0.003906 | -0.006174 | -0.033214 | -0.001192 | 0.143833 | -0.045162 | -0.285010 | -0.059911 | -0.009724 | -0.249146 | -0.249925 | -0.249463 | -0.177429 | -0.158117 | -0.019679 | -0.078056 | -0.158897 | 0.011461 | -0.018100 | -0.030240 | -0.030512 | -0.025205 | -0.049191 | 0.056508 | -0.001100 | -0.134276 | -0.015860 | 0.033410 | 0.023196 | -0.054049 | 1.000000 | -1.000000 | -0.010523 | 0.329109 | 0.058862 | -0.077250 | -0.164110 | 0.417868 | 0.237458 | 0.029924 | 0.006062 | -0.043629 | -0.010258 | 0.436160 | 0.037412 | 0.007547 | 0.010321 | -0.528797 | -0.084232 | 0.109964 | -0.357605 | 0.039455 | 0.082804 | 0.842583 | -0.333429 | 0.002064 | 0.026934 | -0.082322 | -0.105008 | 0.343692 | -0.006936 | -0.235423 | -0.027995 | 0.037513 | 0.124880 | 0.203370 | -0.031715 | 0.138934 | -0.018201 | 0.023418 | -0.082322 | 0.343692 | 0.244974 | -0.066502 | -0.076983 | -0.268159 | -0.003178 | 0.428180 | -0.281422 | 0.041055 | -0.050218 | -0.245461 | -0.123986 | 0.349250 | 0.017982 | 0.012571 | -0.625879 |
| Net worth/Assets | 0.261427 | 0.259972 | 0.264734 | 0.245460 | 0.245606 | -0.010397 | 0.003906 | 0.006174 | 0.033214 | 0.001192 | -0.143833 | 0.045162 | 0.285010 | 0.059911 | 0.009724 | 0.249146 | 0.249925 | 0.249463 | 0.177429 | 0.158117 | 0.019679 | 0.078056 | 0.158897 | -0.011461 | 0.018100 | 0.030240 | 0.030512 | 0.025205 | 0.049191 | -0.056508 | 0.001100 | 0.134276 | 0.015860 | -0.033410 | -0.023196 | 0.054049 | -1.000000 | 1.000000 | 0.010523 | -0.329109 | -0.058862 | 0.077250 | 0.164110 | -0.417868 | -0.237458 | -0.029924 | -0.006062 | 0.043629 | 0.010258 | -0.436160 | -0.037412 | -0.007547 | -0.010321 | 0.528797 | 0.084232 | -0.109964 | 0.357605 | -0.039455 | -0.082804 | -0.842583 | 0.333429 | -0.002064 | -0.026934 | 0.082322 | 0.105008 | -0.343692 | 0.006936 | 0.235423 | 0.027995 | -0.037513 | -0.124880 | -0.203370 | 0.031715 | -0.138934 | 0.018201 | -0.023418 | 0.082322 | -0.343692 | -0.244974 | 0.066502 | 0.076983 | 0.268159 | 0.003178 | -0.428180 | 0.281422 | -0.041055 | 0.050218 | 0.245461 | 0.123986 | -0.349250 | -0.017982 | -0.012571 | 0.625879 |
| Long-term fund suitability ratio (A) | 0.002967 | 0.020707 | 0.003869 | 0.006020 | 0.006189 | -0.000833 | -0.001907 | -0.001227 | -0.001809 | -0.000187 | 0.008990 | -0.047014 | 0.038308 | 0.002245 | -0.012952 | 0.052254 | 0.052245 | 0.052137 | 0.052178 | -0.030964 | 0.128149 | 0.043273 | 0.047935 | 0.001248 | 0.001654 | 0.003240 | 0.003390 | 0.000964 | -0.033926 | -0.001914 | 0.002237 | -0.047478 | 0.000118 | 0.001013 | -0.007224 | 0.030256 | -0.010523 | 0.010523 | 1.000000 | -0.000684 | 0.001385 | 0.043298 | 0.049832 | 0.048349 | 0.045883 | 0.007167 | 0.023772 | -0.040272 | -0.048631 | 0.023839 | -0.002326 | 0.178720 | 0.342925 | 0.077891 | 0.040821 | 0.104518 | 0.052845 | -0.001740 | 0.013864 | 0.036649 | 0.026117 | 0.000936 | 0.000367 | 0.074428 | 0.031667 | 0.013201 | -0.003992 | 0.010451 | 0.052586 | -0.018443 | 0.015572 | -0.001652 | 0.031637 | -0.014006 | 0.008499 | 0.393705 | 0.074428 | 0.013201 | -0.021716 | 0.027432 | 0.052517 | -0.031758 | 0.011582 | -0.021608 | 0.016566 | -0.003589 | 0.003389 | 0.006020 | 0.008679 | 0.001791 | -0.004319 | -0.000342 | 0.104344 |
| Borrowing dependency | -0.161671 | -0.161868 | -0.158618 | -0.085733 | -0.085598 | 0.001092 | -0.004654 | -0.004395 | -0.012037 | -0.002887 | 0.023977 | -0.045529 | -0.078763 | -0.001729 | -0.053177 | -0.123991 | -0.125321 | -0.125188 | -0.144138 | -0.099198 | 0.001181 | -0.107135 | -0.142138 | -0.004898 | 0.002009 | -0.012188 | -0.012109 | -0.007715 | -0.019354 | -0.077942 | -0.009720 | -0.185446 | -0.002542 | 0.016767 | 0.007844 | -0.008080 | 0.329109 | -0.329109 | -0.000684 | 1.000000 | 0.451737 | -0.104721 | -0.134924 | 0.700139 | -0.006443 | 0.034113 | 0.003798 | 0.010203 | 0.025232 | 0.177337 | 0.029949 | -0.009385 | -0.004641 | -0.192554 | -0.095568 | -0.017770 | -0.126419 | -0.067930 | 0.041094 | 0.229825 | -0.093248 | 0.003151 | 0.007702 | -0.075594 | -0.535714 | 0.892772 | -0.007554 | -0.117913 | -0.007016 | 0.022642 | 0.028854 | 0.063473 | -0.004406 | 0.004608 | -0.000476 | 0.015588 | -0.075594 | 0.892772 | 0.806889 | -0.026453 | -0.022185 | -0.117624 | -0.193543 | 0.124908 | -0.177781 | 0.011083 | -0.004183 | -0.085732 | -0.806478 | 0.955857 | 0.007260 | 0.001776 | -0.146012 |
| Contingent liabilities/Net worth | -0.035729 | -0.036183 | -0.034177 | -0.022258 | -0.022239 | 0.000247 | 0.002222 | 0.002117 | 0.003873 | 0.002166 | 0.010618 | -0.012449 | -0.009771 | -0.002792 | -0.012504 | -0.028406 | -0.028915 | -0.028923 | -0.053962 | -0.009521 | 0.004573 | -0.056626 | -0.047234 | -0.001484 | -0.002708 | -0.004558 | -0.004274 | -0.001580 | -0.001136 | -0.001584 | -0.004014 | -0.006771 | 0.004485 | -0.001224 | -0.000074 | -0.001269 | 0.058862 | -0.058862 | 0.001385 | 0.451737 | 1.000000 | -0.056837 | -0.049424 | 0.077200 | 0.020659 | 0.048691 | -0.000373 | -0.014864 | -0.006061 | 0.092726 | -0.000841 | -0.024133 | 0.005206 | -0.065879 | -0.015814 | -0.028117 | -0.019245 | -0.000942 | 0.011087 | 0.049283 | -0.011000 | -0.001142 | -0.000778 | -0.004683 | -0.767778 | 0.622905 | -0.001512 | -0.021729 | -0.000202 | -0.007404 | 0.001941 | 0.000955 | 0.001101 | -0.000351 | 0.000541 | -0.000599 | -0.004683 | 0.622905 | 0.462858 | -0.013533 | -0.004790 | -0.011931 | -0.295827 | 0.084231 | -0.037822 | -0.002446 | 0.001531 | -0.022258 | -0.352618 | 0.621808 | -0.001061 | -0.000620 | -0.018260 |
| Operating profit/Paid-in capital | 0.685028 | 0.651581 | 0.657274 | 0.267411 | 0.266483 | 0.022160 | 0.026091 | 0.023942 | -0.001473 | 0.024295 | 0.071972 | 0.067143 | 0.191723 | 0.016198 | 0.197419 | 0.603024 | 0.602367 | 0.602277 | 0.873641 | 0.465102 | -0.014768 | 0.998696 | 0.858310 | -0.002517 | 0.027680 | 0.066870 | 0.066552 | 0.012002 | -0.038978 | -0.022705 | 0.052078 | 0.203603 | -0.000993 | 0.000816 | -0.000028 | -0.013835 | -0.077250 | 0.077250 | 0.043298 | -0.104721 | -0.056837 | 1.000000 | 0.887370 | 0.020421 | 0.281435 | -0.019353 | 0.003185 | -0.047084 | -0.118531 | 0.142561 | -0.009861 | 0.366215 | -0.008027 | 0.264617 | 0.266090 | 0.259411 | 0.235767 | -0.007147 | -0.023862 | -0.002894 | 0.245298 | -0.003727 | 0.038217 | 0.135385 | 0.124996 | -0.053169 | 0.022794 | 0.413869 | 0.015657 | -0.069488 | 0.008815 | -0.000433 | -0.000332 | -0.012464 | 0.005206 | -0.006723 | 0.135385 | -0.053169 | -0.094417 | 0.216378 | 0.117421 | 0.337872 | 0.132425 | -0.144231 | 0.575833 | -0.032403 | 0.000975 | 0.267412 | 0.181150 | -0.075374 | -0.016052 | 0.006258 | -0.009528 |
| Net profit before tax/Paid-in capital | 0.753339 | 0.758234 | 0.726003 | 0.248104 | 0.246256 | 0.020015 | 0.033900 | 0.030568 | 0.019483 | 0.030334 | 0.082923 | 0.069545 | 0.186619 | 0.054972 | 0.209898 | 0.706646 | 0.705800 | 0.705621 | 0.959461 | 0.445817 | -0.011463 | 0.886157 | 0.962723 | -0.002674 | 0.027997 | 0.088074 | 0.088255 | 0.025103 | -0.028557 | -0.038361 | 0.062054 | 0.158675 | 0.010726 | -0.006861 | -0.001346 | -0.008575 | -0.164110 | 0.164110 | 0.049832 | -0.134924 | -0.049424 | 0.887370 | 1.000000 | -0.030408 | 0.221105 | -0.017993 | 0.005425 | -0.073815 | -0.134096 | 0.078373 | -0.013189 | 0.333660 | -0.008584 | 0.245821 | 0.222639 | 0.178409 | 0.228365 | -0.007288 | -0.035314 | -0.086021 | 0.233192 | -0.002615 | 0.036318 | 0.102487 | 0.115605 | -0.083819 | 0.022193 | 0.483355 | 0.023754 | -0.174815 | 0.008198 | -0.023197 | 0.001825 | -0.031204 | 0.006864 | -0.006716 | 0.102487 | -0.083819 | -0.111974 | 0.218673 | 0.122515 | 0.318991 | 0.130916 | -0.153719 | 0.683623 | -0.029308 | 0.007939 | 0.248106 | 0.215690 | -0.104149 | -0.018358 | 0.008172 | 0.031292 |
| Inventory and accounts receivable/Net value | -0.109888 | -0.078585 | -0.109501 | -0.086720 | -0.086762 | 0.011993 | 0.009042 | 0.008993 | -0.011027 | 0.008243 | 0.079747 | -0.047222 | -0.125484 | 0.035486 | 0.027022 | -0.089396 | -0.089992 | -0.089777 | -0.037986 | -0.125508 | -0.009711 | 0.020064 | -0.031613 | 0.018133 | 0.005847 | -0.005673 | -0.005827 | -0.006369 | -0.045200 | -0.012109 | 0.007472 | -0.195899 | -0.007975 | 0.037141 | 0.005507 | -0.008896 | 0.417868 | -0.417868 | 0.048349 | 0.700139 | 0.077200 | 0.020421 | -0.030408 | 1.000000 | 0.243057 | 0.042552 | 0.010940 | 0.002019 | -0.124019 | 0.453424 | 0.005265 | 0.118059 | 0.014592 | -0.002128 | 0.100201 | 0.337547 | -0.135674 | 0.015249 | 0.050400 | 0.452321 | -0.130778 | 0.008687 | -0.012449 | 0.149187 | 0.040188 | 0.661805 | 0.017873 | -0.064067 | -0.010417 | 0.037055 | 0.041233 | 0.112787 | -0.000594 | 0.049393 | 0.003459 | 0.035281 | 0.149187 | 0.661805 | 0.486097 | -0.025453 | -0.023877 | -0.168028 | -0.159920 | 0.027166 | -0.094770 | 0.027074 | -0.021709 | -0.086720 | -0.422238 | 0.670373 | -0.006647 | 0.006764 | -0.205183 |
| Total Asset Turnover | 0.210622 | 0.223528 | 0.194810 | -0.099661 | -0.100141 | 0.029456 | 0.029667 | 0.029504 | -0.012057 | 0.027711 | 0.195063 | 0.013498 | -0.057978 | 0.008551 | 0.193636 | 0.082026 | 0.082674 | 0.082434 | 0.214710 | 0.052539 | -0.035980 | 0.282591 | 0.230325 | 0.096856 | 0.044088 | 0.059724 | 0.060223 | -0.000215 | -0.072393 | -0.006532 | 0.041372 | 0.055909 | -0.015341 | -0.037770 | -0.002619 | -0.027847 | 0.237458 | -0.237458 | 0.045883 | -0.006443 | 0.020659 | 0.281435 | 0.221105 | 0.243057 | 1.000000 | -0.060229 | 0.004841 | -0.152298 | -0.299431 | 0.757414 | -0.023836 | 0.061330 | 0.035084 | 0.211156 | 0.468956 | 0.497405 | 0.108064 | -0.018390 | -0.047976 | 0.384428 | -0.030660 | -0.021439 | -0.013638 | 0.321993 | 0.100832 | 0.124257 | 0.018985 | 0.124320 | -0.014172 | 0.171386 | 0.353826 | 0.252558 | -0.006442 | 0.060722 | 0.007064 | -0.016059 | 0.321993 | 0.124257 | -0.020655 | 0.093131 | 0.037509 | 0.018741 | 0.052024 | -0.087297 | 0.188774 | 0.041944 | 0.007536 | -0.099661 | 0.041242 | 0.084243 | -0.020769 | 0.019358 | -0.198485 |
| Accounts Receivable Turnover | -0.033947 | -0.031262 | -0.033768 | 0.082342 | 0.082479 | -0.023171 | 0.089576 | 0.079920 | 0.236897 | 0.100604 | -0.028331 | -0.034508 | 0.005904 | -0.006983 | -0.019968 | -0.018647 | -0.019675 | -0.019724 | -0.019997 | -0.020618 | 0.264346 | -0.019500 | -0.020045 | -0.000630 | -0.025684 | -0.019195 | -0.019103 | -0.002989 | -0.030866 | -0.000631 | -0.011291 | -0.018598 | -0.000557 | -0.001574 | -0.001307 | 0.008627 | 0.029924 | -0.029924 | 0.007167 | 0.034113 | 0.048691 | -0.019353 | -0.017993 | 0.042552 | -0.060229 | 1.000000 | -0.001762 | -0.013201 | -0.001709 | -0.032071 | 0.032398 | -0.020483 | -0.001757 | -0.040971 | 0.009706 | -0.006896 | -0.017054 | -0.000962 | 0.036912 | 0.044754 | -0.005032 | 0.235196 | 0.048829 | 0.012795 | -0.015012 | 0.033322 | -0.001859 | -0.016678 | -0.000684 | -0.018303 | -0.019486 | -0.029472 | 0.074727 | -0.022508 | 0.052668 | -0.000557 | 0.012795 | 0.033322 | -0.008364 | 0.012486 | 0.022651 | -0.026659 | 0.005331 | 0.021219 | -0.025062 | 0.113731 | 0.004030 | 0.082343 | -0.006587 | 0.023753 | -0.002111 | -0.000497 | 0.010629 |
| Average Collection Days | 0.007019 | 0.009041 | 0.009921 | 0.022530 | 0.022596 | 0.001001 | 0.002198 | 0.002032 | 0.001987 | 0.002168 | -0.007935 | -0.028471 | 0.006638 | 0.016496 | -0.000443 | -0.002448 | -0.002411 | -0.002455 | 0.007838 | -0.010239 | -0.000985 | 0.002794 | 0.004424 | -0.000937 | -0.001395 | 0.003642 | -0.002623 | -0.000717 | -0.035608 | -0.000526 | -0.001482 | -0.012798 | -0.000464 | 0.016026 | 0.001691 | -0.001005 | 0.006062 | -0.006062 | 0.023772 | 0.003798 | -0.000373 | 0.003185 | 0.005425 | 0.010940 | 0.004841 | -0.001762 | 1.000000 | 0.007906 | -0.013746 | 0.005157 | -0.000653 | 0.088379 | -0.000539 | -0.027486 | -0.051592 | -0.011503 | -0.017105 | -0.000803 | 0.018409 | 0.020865 | 0.011150 | -0.009554 | -0.003676 | 0.024454 | -0.005380 | 0.007744 | -0.003641 | 0.010700 | 0.459146 | 0.040179 | 0.021400 | 0.003969 | -0.000392 | 0.014979 | 0.000073 | 0.006008 | 0.024454 | 0.007744 | -0.007994 | -0.002480 | -0.003105 | 0.004308 | -0.000835 | 0.068197 | 0.011002 | -0.001897 | 0.004210 | 0.022529 | 0.004093 | 0.003221 | -0.001342 | 0.000892 | 0.002744 |
| Inventory Turnover Rate (times) | -0.062660 | -0.054496 | -0.053605 | 0.047665 | 0.047079 | 0.009576 | 0.005560 | 0.005957 | -0.012145 | 0.006203 | -0.129214 | 0.001366 | -0.020141 | 0.007713 | -0.050769 | -0.080684 | -0.080248 | -0.080054 | -0.071460 | -0.037357 | -0.011734 | -0.046790 | -0.073633 | -0.007853 | 0.002095 | -0.013040 | -0.013363 | -0.012954 | 0.030277 | -0.009009 | 0.013148 | -0.021508 | -0.007943 | -0.019058 | -0.011133 | -0.013441 | -0.043629 | 0.043629 | -0.040272 | 0.010203 | -0.014864 | -0.047084 | -0.073815 | 0.002019 | -0.152298 | -0.013201 | 0.007906 | 1.000000 | 0.026388 | -0.115353 | 0.000193 | -0.064956 | -0.025242 | 0.068369 | -0.096696 | 0.023021 | -0.012603 | -0.011180 | -0.021335 | -0.059348 | -0.014231 | 0.000724 | -0.051987 | -0.014910 | 0.016940 | -0.018456 | 0.009397 | -0.048772 | -0.011172 | 0.064959 | -0.177299 | -0.014567 | -0.004528 | 0.017862 | -0.000165 | -0.007658 | -0.014910 | -0.018456 | 0.016432 | -0.015477 | -0.000805 | -0.033695 | -0.016848 | -0.054892 | -0.052668 | 0.009172 | -0.010280 | 0.047666 | -0.026296 | -0.008359 | 0.001189 | -0.011491 | 0.006726 |
| Fixed Assets Turnover Frequency | -0.065919 | -0.136964 | -0.061046 | 0.001239 | 0.001289 | 0.005232 | 0.003581 | 0.004241 | -0.005540 | 0.003439 | -0.055160 | 0.016872 | 0.052978 | -0.036213 | -0.135210 | -0.080971 | -0.081773 | -0.081749 | -0.129457 | 0.013248 | -0.010462 | -0.119366 | -0.133878 | -0.005697 | -0.007908 | -0.005390 | -0.004697 | -0.003729 | 0.009155 | 0.036004 | -0.011274 | 0.015146 | -0.004931 | 0.001797 | 0.025936 | 0.016062 | -0.010258 | 0.010258 | -0.048631 | 0.025232 | -0.006061 | -0.118531 | -0.134096 | -0.124019 | -0.299431 | -0.001709 | -0.013746 | 0.026388 | 1.000000 | -0.183710 | -0.006700 | -0.081441 | -0.015561 | -0.270936 | -0.322658 | -0.386115 | -0.117399 | -0.007697 | 0.017976 | -0.157522 | 0.024049 | 0.005493 | 0.011197 | -0.285401 | -0.097150 | -0.043548 | -0.026209 | -0.100233 | -0.006679 | -0.117786 | -0.042103 | -0.043957 | -0.000966 | -0.039828 | 0.001644 | -0.004931 | -0.285401 | -0.043548 | 0.053104 | -0.052169 | -0.025605 | 0.020639 | -0.026669 | 0.142589 | -0.122505 | -0.013477 | -0.007192 | 0.001240 | -0.013604 | -0.015134 | 0.018565 | 0.000537 | 0.044538 |
| Net Worth Turnover Rate (times) | 0.022896 | 0.036925 | 0.012763 | -0.136157 | -0.136335 | 0.016064 | 0.015513 | 0.015736 | -0.007916 | 0.014559 | 0.165135 | -0.025440 | -0.096663 | 0.046759 | 0.115270 | -0.032829 | -0.032440 | -0.032455 | 0.066033 | 0.089648 | -0.020826 | 0.142435 | 0.089929 | 0.061691 | 0.033000 | 0.035121 | 0.035224 | -0.001916 | -0.084994 | 0.063203 | 0.027419 | 0.103647 | -0.009229 | -0.020018 | 0.006706 | -0.018407 | 0.436160 | -0.436160 | 0.023839 | 0.177337 | 0.092726 | 0.142561 | 0.078373 | 0.453424 | 0.757414 | -0.032071 | 0.005157 | -0.115353 | -0.183710 | 1.000000 | -0.013763 | 0.011007 | 0.023205 | -0.016691 | 0.299791 | 0.358480 | -0.011920 | 0.105531 | -0.017293 | 0.499370 | -0.090999 | -0.017728 | -0.008378 | 0.194031 | 0.105820 | 0.267816 | 0.014774 | 0.001481 | -0.010976 | 0.114822 | 0.271284 | 0.209559 | -0.004805 | 0.041663 | 0.004045 | -0.008856 | 0.194031 | 0.267816 | 0.223240 | 0.044739 | 0.001048 | -0.057832 | 0.084083 | 0.126417 | 0.013776 | 0.059178 | -0.019169 | -0.136158 | -0.034699 | 0.273553 | -0.009233 | 0.017437 | -0.226456 |
| Revenue per person | -0.014834 | -0.014888 | -0.014545 | 0.019022 | 0.019060 | -0.027450 | -0.144956 | -0.146671 | -0.225032 | -0.064693 | -0.010492 | -0.012780 | -0.009018 | -0.002586 | -0.014119 | -0.017799 | -0.017740 | -0.017757 | -0.011412 | -0.008271 | 0.275742 | -0.009899 | -0.014148 | -0.000539 | -0.000853 | -0.129766 | -0.128905 | -0.007249 | -0.017423 | -0.000234 | -0.003493 | -0.003601 | -0.000206 | -0.000583 | -0.003175 | -0.000446 | 0.037412 | -0.037412 | -0.002326 | 0.029949 | -0.000841 | -0.009861 | -0.013189 | 0.005265 | -0.023836 | 0.032398 | -0.000653 | 0.000193 | -0.006700 | -0.013763 | 1.000000 | -0.007845 | -0.000651 | -0.030332 | -0.029715 | -0.025248 | -0.014162 | -0.000356 | -0.001240 | 0.006309 | -0.006774 | -0.001109 | 0.086595 | -0.027657 | -0.019620 | 0.013610 | -0.001617 | -0.016047 | -0.000969 | -0.012086 | -0.007217 | -0.010915 | -0.039606 | -0.014322 | -0.019829 | -0.000206 | -0.027657 | 0.013610 | 0.031914 | -0.004929 | -0.001922 | -0.009930 | -0.004661 | 0.027678 | -0.014309 | -0.000843 | 0.033598 | 0.019021 | -0.006549 | 0.022624 | -0.001084 | -0.002667 | -0.011295 |
| Operating profit per person | 0.301996 | 0.324942 | 0.304522 | 0.224976 | 0.224458 | 0.018248 | 0.020271 | 0.018334 | -0.003658 | 0.018580 | 0.126869 | -0.043509 | 0.131304 | -0.004588 | 0.067826 | 0.261330 | 0.261381 | 0.261477 | 0.351589 | 0.062192 | -0.025973 | 0.367024 | 0.325783 | 0.001962 | 0.015480 | 0.046260 | 0.046101 | 0.005900 | 0.003777 | -0.012497 | 0.022437 | -0.014826 | 0.038839 | 0.074423 | 0.008881 | -0.025393 | 0.007547 | -0.007547 | 0.178720 | -0.009385 | -0.024133 | 0.366215 | 0.333660 | 0.118059 | 0.061330 | -0.020483 | 0.088379 | -0.064956 | -0.081441 | 0.011007 | -0.007845 | 1.000000 | 0.000372 | 0.159706 | 0.000507 | 0.217071 | 0.048863 | 0.034704 | 0.061964 | 0.078832 | 0.118073 | -0.013443 | 0.005857 | 0.125118 | 0.086564 | 0.010680 | 0.017067 | 0.233582 | 0.124769 | -0.146846 | -0.006812 | 0.047514 | -0.006489 | 0.011304 | 0.000762 | -0.004646 | 0.125118 | 0.010680 | -0.051603 | 0.075271 | 0.053627 | 0.091023 | 0.018552 | -0.050780 | 0.306356 | -0.017804 | -0.009823 | 0.224980 | 0.096235 | -0.011415 | -0.006094 | 0.003220 | -0.030261 |
| Allocation rate per person | -0.012543 | -0.006035 | -0.012770 | -0.006953 | -0.007262 | 0.000726 | 0.000685 | 0.000702 | -0.000390 | 0.000652 | -0.009231 | -0.028073 | -0.001554 | 0.024254 | 0.005542 | -0.009322 | -0.009268 | -0.009310 | -0.009403 | -0.010749 | -0.000982 | -0.008020 | -0.008783 | -0.001002 | 0.000348 | 0.000548 | 0.000535 | -0.000199 | -0.010750 | -0.000524 | -0.001571 | -0.005393 | -0.000463 | -0.001309 | -0.002950 | -0.001002 | 0.010321 | -0.010321 | 0.342925 | -0.004641 | 0.005206 | -0.008027 | -0.008584 | 0.014592 | 0.035084 | -0.001757 | -0.000539 | -0.025242 | -0.015561 | 0.023205 | -0.000651 | 0.000372 | 1.000000 | 0.013102 | 0.035254 | 0.032207 | 0.001549 | -0.000800 | 0.004314 | 0.025642 | -0.000400 | -0.000525 | -0.003665 | 0.023199 | 0.008668 | 0.006436 | 0.001754 | 0.001538 | -0.000804 | 0.010872 | 0.013957 | 0.015921 | -0.000003 | 0.012579 | 0.000169 | 0.142653 | 0.023199 | 0.006436 | -0.008978 | -0.002807 | 0.000084 | -0.008625 | -0.000878 | -0.007121 | -0.002153 | -0.001892 | -0.000702 | -0.006952 | 0.000679 | 0.002286 | -0.001859 | -0.000957 | 0.000081 |
| Working Capital to Total Assets | 0.259680 | 0.303532 | 0.260151 | 0.246304 | 0.246221 | 0.025599 | 0.036347 | 0.040405 | 0.010800 | 0.035148 | -0.076724 | 0.011881 | 0.161520 | 0.045141 | 0.056103 | 0.198620 | 0.199598 | 0.199475 | 0.253188 | 0.073991 | 0.020298 | 0.266177 | 0.238435 | 0.012096 | 0.024150 | 0.031928 | 0.031764 | 0.004795 | -0.025894 | -0.044672 | 0.026427 | 0.056922 | -0.020116 | -0.009233 | -0.014455 | 0.051384 | -0.528797 | 0.528797 | 0.077891 | -0.192554 | -0.065879 | 0.264617 | 0.245821 | -0.002128 | 0.211156 | -0.040971 | -0.027486 | 0.068369 | -0.270936 | -0.016691 | -0.030332 | 0.159706 | 0.013102 | 1.000000 | 0.648464 | 0.714868 | 0.585770 | -0.033669 | -0.056213 | -0.364453 | 0.210938 | -0.012443 | -0.045342 | 0.251502 | 0.336555 | -0.161580 | 0.004957 | 0.217189 | -0.013297 | 0.108738 | -0.184527 | -0.208682 | 0.044792 | -0.140998 | 0.036276 | 0.001406 | 0.251502 | -0.161580 | -0.139294 | 0.233958 | 0.176265 | 0.131677 | 0.101122 | -0.625560 | 0.297217 | -0.008452 | 0.038209 | 0.246308 | 0.109766 | -0.177645 | -0.036716 | -0.008912 | 0.369149 |
| Quick Assets/Total Assets | 0.181993 | 0.202017 | 0.166311 | 0.152850 | 0.152805 | 0.026100 | 0.030819 | 0.031453 | -0.001557 | 0.028547 | -0.004686 | 0.064811 | 0.031135 | 0.025246 | 0.119155 | 0.115516 | 0.115606 | 0.115310 | 0.215097 | 0.098143 | -0.014145 | 0.266637 | 0.219267 | 0.015756 | 0.012454 | 0.028478 | 0.028959 | -0.003351 | -0.068393 | -0.019193 | 0.024787 | 0.071882 | -0.023992 | -0.063064 | -0.016619 | -0.002856 | -0.084232 | 0.084232 | 0.040821 | -0.095568 | -0.015814 | 0.266090 | 0.222639 | 0.100201 | 0.468956 | 0.009706 | -0.051592 | -0.096696 | -0.322658 | 0.299791 | -0.029715 | 0.000507 | 0.035254 | 0.648464 | 1.000000 | 0.755453 | 0.590891 | -0.039133 | -0.074537 | 0.152378 | 0.098322 | -0.032204 | 0.036223 | 0.462973 | 0.222932 | 0.024629 | 0.032043 | 0.094297 | -0.023305 | 0.242834 | -0.082370 | -0.217281 | 0.011016 | -0.106712 | 0.014316 | -0.021731 | 0.462973 | 0.024629 | -0.115150 | 0.254929 | 0.161456 | 0.084853 | 0.148667 | -0.339550 | 0.176086 | 0.028238 | 0.028876 | 0.152850 | 0.048011 | -0.024486 | -0.030419 | -0.000886 | 0.095978 |
| Current Assets/Total Assets | 0.098820 | 0.157005 | 0.094083 | 0.094782 | 0.094838 | 0.033821 | 0.037156 | 0.037836 | -0.007613 | 0.034865 | 0.025720 | -0.022853 | -0.049246 | 0.013544 | 0.084386 | 0.047254 | 0.047816 | 0.047968 | 0.176931 | -0.037778 | 0.000271 | 0.260501 | 0.175784 | 0.028146 | 0.018826 | 0.015090 | 0.014820 | -0.007937 | -0.077647 | -0.025684 | 0.038872 | -0.047558 | -0.029002 | 0.027031 | -0.009594 | 0.016898 | 0.109964 | -0.109964 | 0.104518 | -0.017770 | -0.028117 | 0.259411 | 0.178409 | 0.337547 | 0.497405 | -0.006896 | -0.011503 | 0.023021 | -0.386115 | 0.358480 | -0.025248 | 0.217071 | 0.032207 | 0.714868 | 0.755453 | 1.000000 | 0.416425 | 0.008243 | 0.003209 | 0.390629 | 0.002994 | -0.011064 | -0.028397 | 0.565640 | 0.289162 | 0.105331 | 0.029544 | 0.065605 | -0.029451 | 0.194399 | -0.111142 | -0.084330 | 0.017572 | -0.044491 | 0.017334 | 0.025904 | 0.565640 | 0.105331 | -0.103652 | 0.202907 | 0.133968 | -0.059645 | 0.120343 | -0.354198 | 0.133053 | 0.033760 | -0.002520 | 0.094785 | 0.038880 | 0.039449 | -0.031049 | 0.007281 | -0.015311 |
| Cash/Total Assets | 0.235314 | 0.217918 | 0.227144 | 0.241946 | 0.242353 | 0.010465 | 0.017136 | 0.017118 | 0.009002 | 0.016445 | -0.110605 | 0.013524 | 0.228027 | 0.014468 | -0.011758 | 0.185621 | 0.185631 | 0.185300 | 0.240956 | 0.253610 | 0.018125 | 0.236888 | 0.222925 | -0.005605 | 0.008048 | 0.020819 | 0.020039 | -0.001619 | -0.064079 | -0.010481 | 0.013891 | 0.167651 | -0.010622 | -0.029790 | -0.017431 | 0.019683 | -0.357605 | 0.357605 | 0.052845 | -0.126419 | -0.019245 | 0.235767 | 0.228365 | -0.135674 | 0.108064 | -0.017054 | -0.017105 | -0.012603 | -0.117399 | -0.011920 | -0.014162 | 0.048863 | 0.001549 | 0.585770 | 0.590891 | 0.416425 | 1.000000 | -0.016072 | -0.063908 | -0.216580 | 0.300463 | -0.019091 | 0.026710 | 0.243615 | 0.144209 | -0.079910 | -0.002666 | 0.096487 | -0.006558 | 0.184944 | -0.125317 | -0.234160 | 0.014822 | -0.287559 | 0.013979 | -0.010716 | 0.243615 | -0.079910 | -0.095230 | 0.480174 | 0.353270 | 0.281832 | 0.277303 | -0.286691 | 0.195652 | 0.019727 | 0.026421 | 0.241945 | 0.045075 | -0.097849 | -0.024289 | -0.007205 | 0.299732 |
| Quick Assets/Current Liability | -0.010530 | -0.009612 | -0.010014 | -0.003206 | -0.003186 | 0.000365 | 0.000296 | 0.000343 | -0.000294 | 0.000300 | -0.012904 | -0.015717 | -0.008364 | 0.055143 | -0.009854 | -0.014827 | -0.014773 | -0.014794 | -0.006744 | -0.013814 | -0.000538 | -0.007192 | -0.006873 | -0.000494 | 0.000145 | 0.000469 | 0.000461 | -0.000081 | -0.002528 | 0.075915 | 0.000358 | -0.028415 | -0.000254 | -0.000717 | 0.001858 | -0.000549 | 0.039455 | -0.039455 | -0.001740 | -0.067930 | -0.000942 | -0.007147 | -0.007288 | 0.015249 | -0.018390 | -0.000962 | -0.000803 | -0.011180 | -0.007697 | 0.105531 | -0.000356 | 0.034704 | -0.000800 | -0.033669 | -0.039133 | 0.008243 | -0.016072 | 1.000000 | -0.001525 | 0.055302 | -0.009208 | 0.001624 | -0.002008 | 0.024001 | 0.128380 | -0.077315 | 0.004567 | -0.015316 | -0.000274 | -0.012822 | -0.003691 | -0.001284 | -0.000310 | 0.001503 | 0.000109 | -0.000254 | 0.024001 | -0.077315 | -0.005047 | -0.001020 | -0.001393 | -0.018524 | -0.006788 | 0.202883 | -0.008154 | -0.001036 | -0.039538 | -0.003205 | 0.021419 | -0.061051 | -0.000876 | 0.000209 | -0.012449 |
| Cash/Current Liability | -0.046009 | -0.037468 | -0.041296 | -0.030901 | -0.031045 | 0.000301 | -0.001404 | -0.000827 | -0.003560 | -0.001112 | 0.024258 | -0.042133 | -0.023394 | -0.011063 | -0.035265 | -0.033078 | -0.032756 | -0.031013 | -0.034404 | -0.022608 | -0.001871 | -0.023966 | -0.034862 | -0.001799 | -0.002419 | -0.005475 | -0.005428 | -0.001146 | 0.006808 | -0.000999 | -0.002480 | -0.016929 | -0.000882 | 0.151987 | 0.010633 | -0.001910 | 0.082804 | -0.082804 | 0.013864 | 0.041094 | 0.011087 | -0.023862 | -0.035314 | 0.050400 | -0.047976 | 0.036912 | 0.018409 | -0.021335 | 0.017976 | -0.017293 | -0.001240 | 0.061964 | 0.004314 | -0.056213 | -0.074537 | 0.003209 | -0.063908 | -0.001525 | 1.000000 | 0.078276 | -0.024505 | 0.002423 | -0.006982 | 0.008926 | -0.015199 | 0.032776 | -0.001828 | -0.025724 | -0.001625 | -0.033007 | -0.002886 | 0.026529 | -0.000070 | -0.044147 | -0.000009 | 0.048718 | 0.008926 | 0.032776 | 0.012988 | -0.020220 | -0.008704 | -0.029893 | -0.014241 | 0.044378 | -0.034389 | -0.003604 | -0.006208 | -0.030900 | -0.008188 | 0.029935 | 0.001636 | 0.003422 | -0.038004 |
| Current Liability to Assets | -0.210256 | -0.190501 | -0.217186 | -0.198027 | -0.197842 | 0.011340 | 0.001632 | -0.002805 | -0.024357 | 0.000159 | 0.135256 | -0.046075 | -0.278218 | -0.041390 | 0.038520 | -0.198546 | -0.199086 | -0.198721 | -0.097689 | -0.147717 | -0.026361 | -0.003496 | -0.079795 | 0.021559 | -0.006721 | -0.021936 | -0.022080 | -0.016883 | -0.069315 | 0.024605 | 0.016977 | -0.138271 | -0.012141 | 0.048153 | 0.006254 | -0.045142 | 0.842583 | -0.842583 | 0.036649 | 0.229825 | 0.049283 | -0.002894 | -0.086021 | 0.452321 | 0.384428 | 0.044754 | 0.020865 | -0.059348 | -0.157522 | 0.499370 | 0.006309 | 0.078832 | 0.025642 | -0.364453 | 0.152378 | 0.390629 | -0.216580 | 0.055302 | 0.078276 | 1.000000 | -0.273704 | 0.001647 | 0.021875 | 0.422185 | -0.057977 | 0.352986 | 0.032819 | -0.198553 | -0.021716 | 0.115737 | 0.094912 | 0.162418 | -0.035566 | 0.126370 | -0.024671 | 0.032646 | 0.422185 | 0.352986 | 0.045339 | -0.037780 | -0.053637 | -0.252778 | 0.027142 | 0.351832 | -0.214085 | 0.056086 | -0.053657 | -0.198028 | -0.092725 | 0.286398 | 0.006987 | 0.021428 | -0.506360 |
| Operating Funds to Liability | 0.388151 | 0.351107 | 0.387893 | 0.246834 | 0.246781 | 0.020308 | 0.022855 | 0.020877 | -0.003476 | 0.024520 | -0.042396 | 0.035521 | 0.880562 | 0.015720 | 0.051844 | 0.195965 | 0.195903 | 0.195621 | 0.248241 | 0.415122 | 0.127459 | 0.245701 | 0.222474 | -0.023343 | 0.003664 | 0.015927 | 0.015185 | 0.003403 | 0.064132 | -0.006967 | -0.003354 | 0.412683 | 0.003547 | -0.013926 | -0.001680 | 0.265843 | -0.333429 | 0.333429 | 0.026117 | -0.093248 | -0.011000 | 0.245298 | 0.233192 | -0.130778 | -0.030660 | -0.005032 | 0.011150 | -0.014231 | 0.024049 | -0.090999 | -0.006774 | 0.118073 | -0.000400 | 0.210938 | 0.098322 | 0.002994 | 0.300463 | -0.009208 | -0.024505 | -0.273704 | 1.000000 | -0.008492 | 0.028831 | 0.062602 | 0.033619 | -0.091683 | 0.042355 | 0.264403 | 0.036077 | -0.072308 | -0.022532 | -0.077777 | 0.030672 | -0.101758 | 0.013312 | -0.008369 | 0.062602 | -0.091683 | -0.058352 | 0.273598 | 0.382088 | 0.702937 | 0.119707 | -0.139718 | 0.341188 | -0.066037 | 0.010303 | 0.246833 | 0.069212 | -0.091214 | -0.008961 | -0.001268 | 0.398769 |
| Inventory/Working Capital | -0.004447 | -0.000004 | -0.001616 | -0.035025 | -0.035085 | -0.001026 | 0.010231 | 0.009501 | 0.023107 | 0.011909 | -0.008018 | -0.015522 | -0.006801 | -0.002820 | -0.008833 | -0.005685 | -0.005589 | -0.005580 | 0.000283 | -0.009112 | -0.000279 | -0.003723 | -0.002642 | -0.000121 | 0.000081 | 0.001996 | 0.001776 | 0.060758 | 0.010147 | -0.000552 | 0.000492 | -0.005482 | -0.000486 | 0.008347 | 0.004150 | -0.000574 | 0.002064 | -0.002064 | 0.000936 | 0.003151 | -0.001142 | -0.003727 | -0.002615 | 0.008687 | -0.021439 | 0.235196 | -0.009554 | 0.000724 | 0.005493 | -0.017728 | -0.001109 | -0.013443 | -0.000525 | -0.012443 | -0.032204 | -0.011064 | -0.019091 | 0.001624 | 0.002423 | 0.001647 | -0.008492 | 1.000000 | -0.003883 | -0.010507 | -0.000220 | 0.003157 | -0.000724 | -0.001636 | -0.000249 | -0.018227 | -0.008716 | -0.001269 | 0.000116 | 0.002589 | 0.000085 | 0.001658 | -0.010507 | 0.003157 | -0.001960 | 0.005108 | -0.001131 | -0.010441 | 0.008403 | 0.002946 | 0.000992 | 0.000455 | 0.001040 | -0.035024 | 0.000176 | 0.001788 | 0.002294 | 0.023608 | -0.005719 |
| Inventory/Current Liability | 0.013330 | 0.004864 | 0.007302 | 0.035218 | 0.035478 | -0.001748 | -0.017489 | -0.015766 | -0.030963 | -0.005399 | -0.011448 | -0.023691 | 0.019862 | 0.011240 | 0.017662 | -0.015542 | -0.015540 | -0.015649 | 0.031821 | 0.028555 | 0.037494 | 0.037542 | 0.033599 | -0.001290 | 0.001564 | -0.019321 | -0.019272 | -0.001519 | -0.011164 | -0.001316 | 0.001824 | 0.020751 | -0.001161 | -0.003283 | -0.001318 | -0.002515 | 0.026934 | -0.026934 | 0.000367 | 0.007702 | -0.000778 | 0.038217 | 0.036318 | -0.012449 | -0.013638 | 0.048829 | -0.003676 | -0.051987 | 0.011197 | -0.008378 | 0.086595 | 0.005857 | -0.003665 | -0.045342 | 0.036223 | -0.028397 | 0.026710 | -0.002008 | -0.006982 | 0.021875 | 0.028831 | -0.003883 | 1.000000 | -0.000652 | -0.015531 | 0.005083 | 0.001069 | -0.016548 | -0.001274 | 0.006701 | 0.012131 | -0.029148 | -0.003880 | -0.017556 | -0.002274 | -0.001161 | -0.000652 | 0.005083 | 0.009926 | 0.023234 | 0.020917 | 0.026108 | 0.016766 | 0.125024 | 0.003639 | 0.002637 | -0.002553 | 0.035217 | 0.004009 | 0.007637 | -0.000344 | 0.005260 | 0.004448 |
| Current Liabilities/Liability | 0.052783 | 0.080401 | 0.046694 | 0.063547 | 0.064657 | 0.020520 | 0.019009 | 0.013732 | -0.011739 | 0.014602 | -0.013653 | -0.007120 | -0.068729 | 0.025009 | 0.064182 | 0.044689 | 0.044624 | 0.044538 | 0.107310 | 0.006296 | -0.027977 | 0.135996 | 0.106568 | 0.016519 | 0.011871 | -0.000929 | -0.000933 | 0.003392 | -0.044135 | -0.016129 | 0.027073 | -0.011324 | 0.004915 | 0.028012 | -0.022008 | -0.006780 | -0.082322 | 0.082322 | 0.074428 | -0.075594 | -0.004683 | 0.135385 | 0.102487 | 0.149187 | 0.321993 | 0.012795 | 0.024454 | -0.014910 | -0.285401 | 0.194031 | -0.027657 | 0.125118 | 0.023199 | 0.251502 | 0.462973 | 0.565640 | 0.243615 | 0.024001 | 0.008926 | 0.422185 | 0.062602 | -0.010507 | -0.000652 | 1.000000 | 0.071259 | 0.086931 | 0.072485 | 0.027070 | 0.013303 | 0.192153 | -0.038198 | -0.039084 | -0.026895 | -0.006543 | -0.022191 | 0.013908 | 1.000000 | 0.086931 | -0.234454 | 0.054809 | 0.046433 | -0.003067 | 0.041194 | -0.039736 | 0.066277 | 0.034229 | -0.016296 | 0.063546 | 0.017171 | -0.017927 | -0.017911 | 0.009450 | 0.098930 |
| Working Capital/Equity | 0.103819 | 0.120403 | 0.101962 | 0.067970 | 0.067921 | 0.010600 | 0.014933 | 0.016788 | 0.004236 | 0.014826 | 0.007324 | 0.005879 | 0.022609 | 0.068273 | 0.041038 | 0.070112 | 0.070331 | 0.070383 | 0.121854 | -0.009904 | 0.001783 | 0.126546 | 0.114848 | 0.008683 | -0.003457 | 0.004411 | 0.004243 | 0.001327 | -0.013873 | 0.063047 | 0.015315 | 0.036706 | -0.005779 | 0.003585 | -0.002860 | 0.008081 | -0.105008 | 0.105008 | 0.031667 | -0.535714 | -0.767778 | 0.124996 | 0.115605 | 0.040188 | 0.100832 | -0.015012 | -0.005380 | 0.016940 | -0.097150 | 0.105820 | -0.019620 | 0.086564 | 0.008668 | 0.336555 | 0.222932 | 0.289162 | 0.144209 | 0.128380 | -0.015199 | -0.057977 | 0.033619 | -0.000220 | -0.015531 | 0.071259 | 1.000000 | -0.692675 | 0.001969 | 0.076849 | -0.003233 | 0.008948 | -0.048945 | -0.049088 | 0.016589 | -0.020252 | 0.014279 | 0.007193 | 0.071259 | -0.692675 | -0.353136 | 0.065755 | 0.042518 | 0.016175 | 0.256678 | -0.176390 | 0.123817 | -0.003121 | -0.009143 | 0.067970 | 0.585971 | -0.650474 | -0.011134 | -0.002063 | 0.058512 |
| Current Liabilities/Equity | -0.142734 | -0.133816 | -0.142879 | -0.080422 | -0.080350 | 0.001860 | -0.002202 | -0.003196 | -0.008969 | -0.002689 | 0.035899 | -0.037452 | -0.087567 | -0.018880 | -0.015970 | -0.102098 | -0.102539 | -0.102431 | -0.094966 | -0.045022 | -0.006360 | -0.055461 | -0.089174 | 0.004456 | 0.005973 | -0.007517 | -0.007521 | -0.006179 | -0.036283 | -0.055279 | -0.001407 | -0.167594 | -0.003582 | 0.014777 | 0.004542 | -0.009967 | 0.343692 | -0.343692 | 0.013201 | 0.892772 | 0.622905 | -0.053169 | -0.083819 | 0.661805 | 0.124257 | 0.033322 | 0.007744 | -0.018456 | -0.043548 | 0.267816 | 0.013610 | 0.010680 | 0.006436 | -0.161580 | 0.024629 | 0.105331 | -0.079910 | -0.077315 | 0.032776 | 0.352986 | -0.091683 | 0.003157 | 0.005083 | 0.086931 | -0.692675 | 1.000000 | 0.009317 | -0.106075 | -0.006980 | 0.063345 | 0.039557 | 0.067368 | -0.009281 | 0.016699 | -0.005403 | 0.015258 | 0.086931 | 1.000000 | 0.589336 | -0.007062 | -0.016805 | -0.107778 | -0.204632 | 0.108609 | -0.150319 | 0.026494 | -0.004032 | -0.080422 | -0.749621 | 0.963908 | 0.000745 | 0.005038 | -0.156443 |
| Long-term Liability to Current Assets | 0.021508 | 0.022241 | 0.018300 | 0.000522 | 0.000178 | 0.001967 | 0.002062 | 0.002061 | -0.000641 | 0.001882 | 0.001729 | 0.008636 | 0.024446 | 0.006090 | 0.016004 | 0.011322 | 0.011623 | 0.011509 | 0.019496 | 0.031188 | -0.002441 | 0.022299 | 0.047070 | -0.001783 | 0.001546 | 0.005114 | 0.004905 | -0.000507 | 0.016918 | -0.001303 | 0.001410 | 0.023165 | -0.001150 | -0.003252 | -0.006971 | -0.002491 | -0.006936 | 0.006936 | -0.003992 | -0.007554 | -0.001512 | 0.022794 | 0.022193 | 0.017873 | 0.018985 | -0.001859 | -0.003641 | 0.009397 | -0.026209 | 0.014774 | -0.001617 | 0.017067 | 0.001754 | 0.004957 | 0.032043 | 0.029544 | -0.002666 | 0.004567 | -0.001828 | 0.032819 | 0.042355 | -0.000724 | 0.001069 | 0.072485 | 0.001969 | 0.009317 | 1.000000 | -0.001115 | -0.000927 | 0.004543 | -0.014843 | 0.001042 | -0.000288 | -0.014466 | 0.000518 | -0.001150 | 0.072485 | 0.009317 | -0.022575 | -0.002679 | 0.004564 | 0.024814 | -0.003569 | -0.007067 | 0.023702 | -0.004701 | 0.001301 | 0.000523 | 0.007529 | 0.000063 | -0.002160 | 0.003908 | 0.014437 |
| Retained Earnings to Total Assets | 0.650217 | 0.718013 | 0.673738 | 0.164579 | 0.163013 | 0.021280 | 0.036236 | 0.034573 | 0.021105 | 0.034777 | 0.083315 | 0.079153 | 0.223252 | 0.023919 | 0.212204 | 0.491365 | 0.492760 | 0.492734 | 0.492078 | 0.225557 | -0.004366 | 0.415226 | 0.473736 | -0.008902 | 0.021450 | 0.064121 | 0.064422 | 0.011965 | 0.099163 | -0.047416 | 0.025056 | 0.205944 | 0.015432 | -0.011885 | 0.000065 | -0.009129 | -0.235423 | 0.235423 | 0.010451 | -0.117913 | -0.021729 | 0.413869 | 0.483355 | -0.064067 | 0.124320 | -0.016678 | 0.010700 | -0.048772 | -0.100233 | 0.001481 | -0.016047 | 0.233582 | 0.001538 | 0.217189 | 0.094297 | 0.065605 | 0.096487 | -0.015316 | -0.025724 | -0.198553 | 0.264403 | -0.001636 | -0.016548 | 0.027070 | 0.076849 | -0.106075 | -0.001115 | 1.000000 | 0.020517 | -0.541559 | 0.002107 | -0.007653 | 0.006343 | 0.007398 | 0.010140 | -0.002027 | 0.027070 | -0.106075 | -0.073713 | 0.189642 | 0.086905 | 0.371488 | 0.105862 | -0.174742 | 0.794189 | -0.170156 | 0.013460 | 0.164583 | 0.247707 | -0.109810 | -0.013766 | 0.009603 | 0.042936 |
| Total income/Total expense | 0.023450 | 0.028873 | 0.024436 | 0.043608 | 0.043610 | 0.002047 | 0.003322 | 0.002985 | 0.001701 | 0.003094 | -0.001955 | -0.008301 | 0.026204 | -0.000202 | -0.000944 | 0.023143 | 0.023136 | 0.023117 | 0.023013 | 0.011126 | -0.000558 | 0.015660 | 0.022244 | -0.000307 | 0.000806 | 0.002600 | 0.002599 | 0.000653 | -0.020221 | -0.000709 | 0.000794 | 0.004849 | 0.003411 | -0.000314 | -0.000430 | -0.000725 | -0.027995 | 0.027995 | 0.052586 | -0.007016 | -0.000202 | 0.015657 | 0.023754 | -0.010417 | -0.014172 | -0.000684 | 0.459146 | -0.011172 | -0.006679 | -0.010976 | -0.000969 | 0.124769 | -0.000804 | -0.013297 | -0.023305 | -0.029451 | -0.006558 | -0.000274 | -0.001625 | -0.021716 | 0.036077 | -0.000249 | -0.001274 | 0.013303 | -0.003233 | -0.006980 | -0.000927 | 0.020517 | 1.000000 | -0.022219 | -0.000485 | -0.002178 | -0.000034 | 0.001812 | 0.000466 | -0.000429 | 0.013303 | -0.006980 | -0.005368 | 0.002000 | 0.000818 | 0.018124 | 0.000969 | 0.115768 | 0.027065 | -0.001687 | 0.017818 | 0.043609 | 0.006213 | -0.007383 | -0.001156 | 0.000088 | 0.031325 |
| Total expense/Assets | -0.296019 | -0.357147 | -0.322223 | 0.225479 | 0.226170 | 0.005401 | -0.004525 | -0.002803 | -0.022281 | -0.004472 | -0.249426 | -0.049438 | -0.081517 | -0.010339 | -0.081476 | -0.235573 | -0.234878 | -0.235062 | -0.177996 | -0.013919 | -0.017885 | -0.070742 | -0.156954 | 0.050982 | -0.004173 | -0.031487 | -0.032827 | -0.013009 | -0.079619 | 0.042600 | -0.015412 | -0.060663 | -0.011762 | -0.018296 | -0.013466 | -0.015686 | 0.037513 | -0.037513 | -0.018443 | 0.022642 | -0.007404 | -0.069488 | -0.174815 | 0.037055 | 0.171386 | -0.018303 | 0.040179 | 0.064959 | -0.117786 | 0.114822 | -0.012086 | -0.146846 | 0.010872 | 0.108738 | 0.242834 | 0.194399 | 0.184944 | -0.012822 | -0.033007 | 0.115737 | -0.072308 | -0.018227 | 0.006701 | 0.192153 | 0.008948 | 0.063345 | 0.004543 | -0.541559 | -0.022219 | 1.000000 | 0.026547 | -0.003683 | -0.003656 | -0.008046 | 0.000045 | -0.011092 | 0.192153 | 0.063345 | -0.003318 | -0.127591 | -0.045883 | -0.120342 | -0.090256 | -0.012881 | -0.470498 | 0.107944 | -0.004314 | 0.225478 | -0.190700 | 0.050501 | -0.017607 | -0.005122 | 0.007907 |
| Current Asset Turnover Rate | 0.005716 | -0.000869 | -0.002611 | -0.121275 | -0.121320 | 0.008117 | 0.008065 | 0.008174 | -0.003545 | 0.007518 | 0.170776 | -0.046460 | -0.007870 | -0.009759 | 0.063944 | -0.017123 | -0.016616 | -0.016690 | -0.000412 | 0.042716 | -0.010894 | 0.008130 | 0.009443 | 0.019388 | 0.011499 | 0.005805 | 0.005591 | -0.002073 | 0.000739 | 0.008879 | -0.005130 | 0.037312 | -0.002240 | -0.014514 | 0.030494 | -0.011118 | 0.124880 | -0.124880 | 0.015572 | 0.028854 | 0.001941 | 0.008815 | 0.008198 | 0.041233 | 0.353826 | -0.019486 | 0.021400 | -0.177299 | -0.042103 | 0.271284 | -0.007217 | -0.006812 | 0.013957 | -0.184527 | -0.082370 | -0.111142 | -0.125317 | -0.003691 | -0.002886 | 0.094912 | -0.022532 | -0.008716 | 0.012131 | -0.038198 | -0.048945 | 0.039557 | -0.014843 | 0.002107 | -0.000485 | 0.026547 | 1.000000 | 0.423768 | -0.003783 | 0.072658 | 0.001836 | -0.005134 | -0.038198 | 0.039557 | 0.032434 | -0.054514 | -0.030324 | 0.035322 | -0.033221 | 0.127562 | -0.005032 | -0.005085 | 0.012551 | -0.121277 | -0.015505 | 0.043730 | -0.003415 | 0.025886 | -0.101157 |
| Quick Asset Turnover Rate | -0.027280 | -0.025143 | -0.029928 | -0.129715 | -0.129747 | 0.012696 | 0.012206 | 0.012368 | -0.006365 | 0.011440 | 0.153936 | -0.034643 | -0.057385 | -0.026821 | 0.046586 | -0.043038 | -0.042926 | -0.043121 | -0.029352 | -0.028332 | -0.016476 | -0.001263 | -0.024310 | 0.003755 | 0.012411 | 0.014418 | 0.014146 | -0.007177 | 0.032340 | 0.006129 | -0.009262 | -0.022286 | -0.007291 | 0.007069 | 0.007865 | -0.016815 | 0.203370 | -0.203370 | -0.001652 | 0.063473 | 0.000955 | -0.000433 | -0.023197 | 0.112787 | 0.252558 | -0.029472 | 0.003969 | -0.014567 | -0.043957 | 0.209559 | -0.010915 | 0.047514 | 0.015921 | -0.208682 | -0.217281 | -0.084330 | -0.234160 | -0.001284 | 0.026529 | 0.162418 | -0.077777 | -0.001269 | -0.029148 | -0.039084 | -0.049088 | 0.067368 | 0.001042 | -0.007653 | -0.002178 | -0.003683 | 0.423768 | 1.000000 | -0.004605 | 0.156490 | 0.002763 | -0.007765 | -0.039084 | 0.067368 | 0.041667 | -0.095772 | -0.055200 | -0.042236 | -0.058753 | 0.127015 | -0.023834 | -0.001023 | -0.024169 | -0.129714 | -0.017511 | 0.068305 | 0.001264 | -0.009416 | -0.156954 |
| Working capitcal Turnover Rate | 0.001824 | 0.004491 | 0.002488 | 0.020451 | 0.020536 | -0.229568 | 0.090689 | 0.244911 | 0.742290 | 0.110703 | -0.003331 | -0.002909 | 0.044478 | 0.000607 | -0.002915 | 0.006438 | 0.006426 | 0.006431 | 0.002192 | -0.000240 | 0.184916 | -0.000435 | 0.001388 | 0.000023 | -0.005974 | 0.000371 | 0.000198 | -0.000385 | 0.004511 | -0.000367 | -0.001643 | 0.003031 | -0.000487 | 0.002905 | -0.000240 | 0.015300 | -0.031715 | 0.031715 | 0.031637 | -0.004406 | 0.001101 | -0.000332 | 0.001825 | -0.000594 | -0.006442 | 0.074727 | -0.000392 | -0.004528 | -0.000966 | -0.004805 | -0.039606 | -0.006489 | -0.000003 | 0.044792 | 0.011016 | 0.017572 | 0.014822 | -0.000310 | -0.000070 | -0.035566 | 0.030672 | 0.000116 | -0.003880 | -0.026895 | 0.016589 | -0.009281 | -0.000288 | 0.006343 | -0.000034 | -0.003656 | -0.003783 | -0.004605 | 1.000000 | -0.004215 | 0.948194 | 0.001218 | -0.026895 | -0.009281 | -0.001277 | 0.049727 | 0.051763 | 0.001734 | 0.033145 | -0.197348 | 0.005680 | 0.000122 | 0.005063 | 0.020450 | 0.002406 | -0.007808 | -0.000342 | -0.000004 | 0.077424 |
| Cash Turnover Rate | -0.029477 | -0.025817 | -0.030410 | -0.071579 | -0.071321 | 0.016485 | 0.015581 | 0.015792 | -0.008805 | 0.014596 | 0.040730 | 0.070369 | -0.093887 | -0.019243 | 0.044051 | -0.054775 | -0.053797 | -0.053930 | -0.034256 | -0.058218 | -0.021618 | -0.012514 | -0.029827 | -0.009126 | 0.014003 | 0.014097 | 0.013462 | -0.006509 | 0.064419 | -0.002081 | 0.018438 | -0.041483 | -0.003635 | -0.022126 | 0.002048 | -0.016561 | 0.138934 | -0.138934 | -0.014006 | 0.004608 | -0.000351 | -0.012464 | -0.031204 | 0.049393 | 0.060722 | -0.022508 | 0.014979 | 0.017862 | -0.039828 | 0.041663 | -0.014322 | 0.011304 | 0.012579 | -0.140998 | -0.106712 | -0.044491 | -0.287559 | 0.001503 | -0.044147 | 0.126370 | -0.101758 | 0.002589 | -0.017556 | -0.006543 | -0.020252 | 0.016699 | -0.014466 | 0.007398 | 0.001812 | -0.008046 | 0.072658 | 0.156490 | -0.004215 | 1.000000 | 0.003385 | -0.004830 | -0.006543 | 0.016699 | 0.000606 | -0.144087 | -0.101751 | -0.069359 | -0.074053 | 0.073448 | -0.016255 | -0.008560 | -0.023444 | -0.071578 | 0.009240 | 0.013463 | 0.000143 | -0.003773 | -0.145710 |
| Cash Flow to Sales | 0.011759 | 0.012198 | 0.011977 | -0.041559 | -0.041604 | -0.084747 | 0.233675 | 0.379952 | 0.677230 | 0.254886 | 0.003082 | 0.003677 | 0.016086 | 0.000173 | 0.004227 | 0.009424 | 0.009398 | 0.009403 | 0.007255 | 0.004107 | 0.037165 | 0.005120 | 0.006428 | -0.000182 | -0.005175 | 0.000916 | 0.000756 | -0.000180 | 0.004524 | 0.000039 | -0.000610 | 0.006674 | -0.000101 | 0.000025 | -0.000038 | 0.004189 | -0.018201 | 0.018201 | 0.008499 | -0.000476 | 0.000541 | 0.005206 | 0.006864 | 0.003459 | 0.007064 | 0.052668 | 0.000073 | -0.000165 | 0.001644 | 0.004045 | -0.019829 | 0.000762 | 0.000169 | 0.036276 | 0.014316 | 0.017334 | 0.013979 | 0.000109 | -0.000009 | -0.024671 | 0.013312 | 0.000085 | -0.002274 | -0.022191 | 0.014279 | -0.005403 | 0.000518 | 0.010140 | 0.000466 | 0.000045 | 0.001836 | 0.002763 | 0.948194 | 0.003385 | 1.000000 | -0.000049 | -0.022191 | -0.005403 | 0.000895 | 0.065010 | 0.037196 | 0.008168 | 0.043021 | -0.202509 | 0.011634 | 0.000182 | 0.003314 | -0.041560 | 0.002661 | -0.003770 | 0.000208 | -0.000083 | 0.022476 |
| Fixed Assets to Assets | -0.009192 | -0.005860 | -0.008364 | 0.003507 | 0.003524 | 0.000106 | -0.000047 | -0.000003 | -0.000355 | -0.000016 | -0.007464 | -0.009092 | -0.006933 | -0.001840 | -0.010045 | -0.003112 | -0.003094 | -0.003107 | -0.006477 | -0.023987 | -0.000311 | -0.006732 | -0.006482 | -0.000301 | 0.000042 | 0.000381 | 0.000376 | -0.000043 | 0.006570 | -0.000166 | -0.000206 | -0.034956 | -0.000147 | -0.000415 | -0.013903 | -0.000318 | 0.023418 | -0.023418 | 0.393705 | 0.015588 | -0.000599 | -0.006723 | -0.006716 | 0.035281 | -0.016059 | -0.000557 | 0.006008 | -0.007658 | -0.004931 | -0.008856 | -0.000206 | -0.004646 | 0.142653 | 0.001406 | -0.021731 | 0.025904 | -0.010716 | -0.000254 | 0.048718 | 0.032646 | -0.008369 | 0.001658 | -0.001161 | 0.013908 | 0.007193 | 0.015258 | -0.001150 | -0.002027 | -0.000429 | -0.011092 | -0.005134 | -0.007765 | 0.001218 | -0.004830 | -0.000049 | 1.000000 | 0.013908 | 0.015258 | -0.002922 | -0.003535 | -0.001378 | -0.021983 | -0.003228 | 0.001686 | -0.004380 | -0.000599 | -0.006361 | 0.003506 | -0.000700 | 0.010509 | -0.000825 | -0.002169 | -0.007758 |
| Current Liability to Liability | 0.052783 | 0.080401 | 0.046694 | 0.063547 | 0.064657 | 0.020520 | 0.019009 | 0.013732 | -0.011739 | 0.014602 | -0.013653 | -0.007120 | -0.068729 | 0.025009 | 0.064182 | 0.044689 | 0.044624 | 0.044538 | 0.107310 | 0.006296 | -0.027977 | 0.135996 | 0.106568 | 0.016519 | 0.011871 | -0.000929 | -0.000933 | 0.003392 | -0.044135 | -0.016129 | 0.027073 | -0.011324 | 0.004915 | 0.028012 | -0.022008 | -0.006780 | -0.082322 | 0.082322 | 0.074428 | -0.075594 | -0.004683 | 0.135385 | 0.102487 | 0.149187 | 0.321993 | 0.012795 | 0.024454 | -0.014910 | -0.285401 | 0.194031 | -0.027657 | 0.125118 | 0.023199 | 0.251502 | 0.462973 | 0.565640 | 0.243615 | 0.024001 | 0.008926 | 0.422185 | 0.062602 | -0.010507 | -0.000652 | 1.000000 | 0.071259 | 0.086931 | 0.072485 | 0.027070 | 0.013303 | 0.192153 | -0.038198 | -0.039084 | -0.026895 | -0.006543 | -0.022191 | 0.013908 | 1.000000 | 0.086931 | -0.234454 | 0.054809 | 0.046433 | -0.003067 | 0.041194 | -0.039736 | 0.066277 | 0.034229 | -0.016296 | 0.063546 | 0.017171 | -0.017927 | -0.017911 | 0.009450 | 0.098930 |
| Current Liability to Equity | -0.142734 | -0.133816 | -0.142879 | -0.080422 | -0.080350 | 0.001860 | -0.002202 | -0.003196 | -0.008969 | -0.002689 | 0.035899 | -0.037452 | -0.087567 | -0.018880 | -0.015970 | -0.102098 | -0.102539 | -0.102431 | -0.094966 | -0.045022 | -0.006360 | -0.055461 | -0.089174 | 0.004456 | 0.005973 | -0.007517 | -0.007521 | -0.006179 | -0.036283 | -0.055279 | -0.001407 | -0.167594 | -0.003582 | 0.014777 | 0.004542 | -0.009967 | 0.343692 | -0.343692 | 0.013201 | 0.892772 | 0.622905 | -0.053169 | -0.083819 | 0.661805 | 0.124257 | 0.033322 | 0.007744 | -0.018456 | -0.043548 | 0.267816 | 0.013610 | 0.010680 | 0.006436 | -0.161580 | 0.024629 | 0.105331 | -0.079910 | -0.077315 | 0.032776 | 0.352986 | -0.091683 | 0.003157 | 0.005083 | 0.086931 | -0.692675 | 1.000000 | 0.009317 | -0.106075 | -0.006980 | 0.063345 | 0.039557 | 0.067368 | -0.009281 | 0.016699 | -0.005403 | 0.015258 | 0.086931 | 1.000000 | 0.589336 | -0.007062 | -0.016805 | -0.107778 | -0.204632 | 0.108609 | -0.150319 | 0.026494 | -0.004032 | -0.080422 | -0.749621 | 0.963908 | 0.000745 | 0.005038 | -0.156443 |
| Equity to Long-term Liability | -0.086535 | -0.103015 | -0.083190 | -0.068810 | -0.068763 | -0.000654 | -0.007929 | -0.006326 | -0.014377 | -0.003093 | 0.024837 | -0.012524 | -0.040178 | 0.038229 | -0.054426 | -0.089004 | -0.090725 | -0.090667 | -0.114381 | -0.037488 | 0.010574 | -0.094285 | -0.110478 | -0.004258 | -0.012554 | -0.018494 | -0.018514 | -0.008501 | -0.013855 | -0.071401 | -0.011811 | -0.018924 | -0.002434 | -0.003287 | 0.008543 | -0.006328 | 0.244974 | -0.244974 | -0.021716 | 0.806889 | 0.462858 | -0.094417 | -0.111974 | 0.486097 | -0.020655 | -0.008364 | -0.007994 | 0.016432 | 0.053104 | 0.223240 | 0.031914 | -0.051603 | -0.008978 | -0.139294 | -0.115150 | -0.103652 | -0.095230 | -0.005047 | 0.012988 | 0.045339 | -0.058352 | -0.001960 | 0.009926 | -0.234454 | -0.353136 | 0.589336 | -0.022575 | -0.073713 | -0.005368 | -0.003318 | 0.032434 | 0.041667 | -0.001277 | 0.000606 | 0.000895 | -0.002922 | -0.234454 | 0.589336 | 1.000000 | -0.027925 | -0.016939 | -0.047268 | -0.218999 | 0.129499 | -0.120242 | 0.006302 | -0.001439 | -0.068809 | -0.615905 | 0.778135 | 0.002936 | -0.007973 | -0.110885 |
| Cash Flow to Total Assets | 0.262454 | 0.263591 | 0.258428 | 0.098097 | 0.098056 | 0.020918 | 0.041845 | 0.046314 | 0.033285 | 0.042328 | 0.007630 | 0.008293 | 0.224786 | -0.012599 | 0.030302 | 0.142925 | 0.142930 | 0.143017 | 0.222378 | 0.246791 | 0.015799 | 0.218511 | 0.224392 | -0.004178 | 0.008680 | 0.027411 | 0.026884 | -0.000208 | -0.104641 | -0.008237 | 0.038569 | 0.247814 | -0.006635 | -0.008936 | -0.011678 | 0.022285 | -0.066502 | 0.066502 | 0.027432 | -0.026453 | -0.013533 | 0.216378 | 0.218673 | -0.025453 | 0.093131 | 0.012486 | -0.002480 | -0.015477 | -0.052169 | 0.044739 | -0.004929 | 0.075271 | -0.002807 | 0.233958 | 0.254929 | 0.202907 | 0.480174 | -0.001020 | -0.020220 | -0.037780 | 0.273598 | 0.005108 | 0.023234 | 0.054809 | 0.065755 | -0.007062 | -0.002679 | 0.189642 | 0.002000 | -0.127591 | -0.054514 | -0.095772 | 0.049727 | -0.144087 | 0.065010 | -0.003535 | 0.054809 | -0.007062 | -0.027925 | 1.000000 | 0.712655 | 0.332471 | 0.589998 | -0.115009 | 0.254898 | -0.016782 | 0.011693 | 0.098099 | 0.050483 | -0.015893 | -0.002991 | 0.000176 | 0.034015 |
| Cash Flow to Liability | 0.159699 | 0.157065 | 0.157022 | 0.114138 | 0.114060 | 0.004669 | 0.011517 | 0.012243 | 0.011813 | 0.014059 | -0.006762 | -0.008771 | 0.364812 | -0.011341 | 0.023895 | 0.076980 | 0.076917 | 0.076871 | 0.123403 | 0.129250 | 0.156966 | 0.118816 | 0.123983 | 0.002393 | 0.003389 | 0.015736 | 0.015741 | 0.000120 | -0.064552 | -0.001937 | 0.019516 | 0.136652 | -0.005370 | -0.004089 | -0.004043 | 0.149285 | -0.076983 | 0.076983 | 0.052517 | -0.022185 | -0.004790 | 0.117421 | 0.122515 | -0.023877 | 0.037509 | 0.022651 | -0.003105 | -0.000805 | -0.025605 | 0.001048 | -0.001922 | 0.053627 | 0.000084 | 0.176265 | 0.161456 | 0.133968 | 0.353270 | -0.001393 | -0.008704 | -0.053637 | 0.382088 | -0.001131 | 0.020917 | 0.046433 | 0.042518 | -0.016805 | 0.004564 | 0.086905 | 0.000818 | -0.045883 | -0.030324 | -0.055200 | 0.051763 | -0.101751 | 0.037196 | -0.001378 | 0.046433 | -0.016805 | -0.016939 | 0.712655 | 1.000000 | 0.214116 | 0.320613 | -0.071694 | 0.142567 | -0.014123 | 0.015018 | 0.114140 | 0.026664 | -0.019213 | -0.002400 | 0.001836 | 0.109775 |
| CFO to Assets | 0.504311 | 0.443017 | 0.497042 | 0.226990 | 0.226912 | 0.026682 | 0.031813 | 0.029454 | -0.000973 | 0.030653 | -0.005426 | 0.073629 | 0.603305 | 0.011482 | 0.103101 | 0.230814 | 0.230629 | 0.230342 | 0.333636 | 0.715003 | -0.002226 | 0.338206 | 0.308200 | -0.046844 | 0.002719 | 0.014993 | 0.013393 | -0.001262 | 0.100864 | -0.018421 | -0.023001 | 0.738276 | 0.001195 | -0.023447 | 0.008880 | 0.023123 | -0.268159 | 0.268159 | -0.031758 | -0.117624 | -0.011931 | 0.337872 | 0.318991 | -0.168028 | 0.018741 | -0.026659 | 0.004308 | -0.033695 | 0.020639 | -0.057832 | -0.009930 | 0.091023 | -0.008625 | 0.131677 | 0.084853 | -0.059645 | 0.281832 | -0.018524 | -0.029893 | -0.252778 | 0.702937 | -0.010441 | 0.026108 | -0.003067 | 0.016175 | -0.107778 | 0.024814 | 0.371488 | 0.018124 | -0.120342 | 0.035322 | -0.042236 | 0.001734 | -0.069359 | 0.008168 | -0.021983 | -0.003067 | -0.107778 | -0.047268 | 0.332471 | 0.214116 | 1.000000 | 0.181298 | -0.099646 | 0.440095 | -0.076041 | 0.007769 | 0.226990 | 0.107850 | -0.098545 | -0.003771 | 0.006057 | 0.113629 |
| Cash Flow to Equity | 0.129002 | 0.112929 | 0.123622 | 0.030672 | 0.030676 | 0.014088 | 0.026245 | 0.030022 | 0.018515 | 0.027140 | 0.014722 | 0.008972 | 0.097761 | -0.006995 | 0.021563 | 0.072982 | 0.073000 | 0.073080 | 0.129661 | 0.199675 | 0.003660 | 0.126165 | 0.105591 | -0.005539 | -0.020307 | -0.006925 | -0.007324 | -0.000834 | -0.048385 | 0.013669 | 0.022967 | 0.089089 | -0.002907 | -0.005717 | -0.011266 | 0.006953 | -0.003178 | 0.003178 | 0.011582 | -0.193543 | -0.295827 | 0.132425 | 0.130916 | -0.159920 | 0.052024 | 0.005331 | -0.000835 | -0.016848 | -0.026669 | 0.084083 | -0.004661 | 0.018552 | -0.000878 | 0.101122 | 0.148667 | 0.120343 | 0.277303 | -0.006788 | -0.014241 | 0.027142 | 0.119707 | 0.008403 | 0.016766 | 0.041194 | 0.256678 | -0.204632 | -0.003569 | 0.105862 | 0.000969 | -0.090256 | -0.033221 | -0.058753 | 0.033145 | -0.074053 | 0.043021 | -0.003228 | 0.041194 | -0.204632 | -0.218999 | 0.589998 | 0.320613 | 0.181298 | 1.000000 | -0.063386 | 0.120228 | -0.024583 | 0.005883 | 0.030672 | 0.180666 | -0.231107 | -0.001471 | 0.000239 | 0.003321 |
| Current Liability to Current Assets | -0.160725 | -0.195673 | -0.162572 | -0.132650 | -0.132607 | -0.079679 | -0.138584 | -0.166453 | -0.084875 | -0.140264 | 0.015511 | -0.065204 | -0.126473 | 0.000022 | -0.053579 | -0.164367 | -0.165083 | -0.165011 | -0.154690 | -0.052804 | -0.006224 | -0.145741 | -0.148721 | -0.002813 | -0.021073 | -0.033598 | -0.033395 | -0.009174 | -0.027703 | 0.117590 | -0.008920 | -0.052971 | 0.347630 | 0.006420 | 0.014302 | -0.025966 | 0.428180 | -0.428180 | -0.021608 | 0.124908 | 0.084231 | -0.144231 | -0.153719 | 0.027166 | -0.087297 | 0.021219 | 0.068197 | -0.054892 | 0.142589 | 0.126417 | 0.027678 | -0.050780 | -0.007121 | -0.625560 | -0.339550 | -0.354198 | -0.286691 | 0.202883 | 0.044378 | 0.351832 | -0.139718 | 0.002946 | 0.125024 | -0.039736 | -0.176390 | 0.108609 | -0.007067 | -0.174742 | 0.115768 | -0.012881 | 0.127562 | 0.127015 | -0.197348 | 0.073448 | -0.202509 | 0.001686 | -0.039736 | 0.108609 | 0.129499 | -0.115009 | -0.071694 | -0.099646 | -0.063386 | 1.000000 | -0.202751 | 0.012853 | -0.049342 | -0.132652 | -0.070354 | 0.132372 | 0.022033 | 0.007652 | -0.262199 |
| Net Income to Total Assets | 0.887670 | 0.961552 | 0.912040 | 0.300143 | 0.298155 | 0.028482 | 0.048587 | 0.045390 | 0.028423 | 0.045600 | 0.071365 | 0.079169 | 0.281309 | 0.048735 | 0.231210 | 0.493776 | 0.493803 | 0.493822 | 0.691152 | 0.292252 | -0.008315 | 0.577846 | 0.671748 | 0.003064 | 0.041046 | 0.119596 | 0.119870 | 0.024257 | 0.080031 | -0.072408 | 0.062183 | 0.252716 | 0.014946 | -0.017779 | 0.004969 | -0.008056 | -0.281422 | 0.281422 | 0.016566 | -0.177781 | -0.037822 | 0.575833 | 0.683623 | -0.094770 | 0.188774 | -0.025062 | 0.011002 | -0.052668 | -0.122505 | 0.013776 | -0.014309 | 0.306356 | -0.002153 | 0.297217 | 0.176086 | 0.133053 | 0.195652 | -0.008154 | -0.034389 | -0.214085 | 0.341188 | 0.000992 | 0.003639 | 0.066277 | 0.123817 | -0.150319 | 0.023702 | 0.794189 | 0.027065 | -0.470498 | -0.005032 | -0.023834 | 0.005680 | -0.016255 | 0.011634 | -0.004380 | 0.066277 | -0.150319 | -0.120242 | 0.254898 | 0.142567 | 0.440095 | 0.120228 | -0.202751 | 1.000000 | -0.105201 | 0.011942 | 0.300146 | 0.328492 | -0.159697 | -0.010463 | 0.012746 | 0.073916 |
| Total assets to GNP price | -0.071725 | -0.098900 | -0.089088 | 0.022672 | 0.022750 | -0.003338 | -0.004243 | -0.003786 | -0.000408 | -0.004623 | -0.025524 | -0.020166 | -0.052766 | -0.007519 | -0.023643 | -0.059970 | -0.059780 | -0.059826 | -0.033509 | -0.023591 | -0.001272 | -0.032299 | -0.028837 | 0.002692 | 0.000063 | 0.001185 | 0.001166 | -0.000307 | -0.038909 | -0.000679 | 0.006583 | -0.041889 | -0.000599 | -0.001694 | -0.000586 | -0.001298 | 0.041055 | -0.041055 | -0.003589 | 0.011083 | -0.002446 | -0.032403 | -0.029308 | 0.027074 | 0.041944 | 0.113731 | -0.001897 | 0.009172 | -0.013477 | 0.059178 | -0.000843 | -0.017804 | -0.001892 | -0.008452 | 0.028238 | 0.033760 | 0.019727 | -0.001036 | -0.003604 | 0.056086 | -0.066037 | 0.000455 | 0.002637 | 0.034229 | -0.003121 | 0.026494 | -0.004701 | -0.170156 | -0.001687 | 0.107944 | -0.005085 | -0.001023 | 0.000122 | -0.008560 | 0.000182 | -0.000599 | 0.034229 | 0.026494 | 0.006302 | -0.016782 | -0.014123 | -0.076041 | -0.024583 | 0.012853 | -0.105201 | 1.000000 | -0.000584 | 0.022673 | -0.040217 | 0.021982 | -0.001881 | 0.000239 | 0.014871 |
| No-credit Interval | 0.008135 | 0.011463 | 0.007523 | 0.004205 | 0.004038 | 0.000199 | -0.000075 | -0.001091 | -0.000637 | -0.000556 | 0.006497 | -0.006838 | 0.013642 | 0.003175 | 0.011488 | 0.014303 | 0.014424 | 0.014335 | 0.003791 | 0.002721 | 0.027256 | 0.001169 | 0.008267 | -0.000764 | -0.000180 | 0.002108 | 0.002026 | 0.002108 | -0.013174 | -0.010080 | -0.000310 | 0.008558 | 0.008178 | -0.014929 | -0.000076 | 0.002556 | -0.050218 | 0.050218 | 0.003389 | -0.004183 | 0.001531 | 0.000975 | 0.007939 | -0.021709 | 0.007536 | 0.004030 | 0.004210 | -0.010280 | -0.007192 | -0.019169 | 0.033598 | -0.009823 | -0.000702 | 0.038209 | 0.028876 | -0.002520 | 0.026421 | -0.039538 | -0.006208 | -0.053657 | 0.010303 | 0.001040 | -0.002553 | -0.016296 | -0.009143 | -0.004032 | 0.001301 | 0.013460 | 0.017818 | -0.004314 | 0.012551 | -0.024169 | 0.005063 | -0.023444 | 0.003314 | -0.006361 | -0.016296 | -0.004032 | -0.001439 | 0.011693 | 0.015018 | 0.007769 | 0.005883 | -0.049342 | 0.011942 | -0.000584 | 1.000000 | 0.004203 | 0.000127 | -0.003724 | -0.008812 | 0.001027 | 0.050609 |
| Gross Profit to Sales | 0.334721 | 0.326971 | 0.333750 | 1.000000 | 0.999518 | 0.005746 | 0.032494 | 0.027176 | 0.051437 | 0.029431 | -0.206354 | -0.016975 | 0.341186 | 0.017198 | 0.067971 | 0.144662 | 0.145032 | 0.145058 | 0.256723 | 0.163190 | 0.117044 | 0.267946 | 0.247791 | 0.014172 | 0.022866 | 0.054639 | 0.053430 | 0.009122 | 0.016014 | -0.017450 | 0.026544 | 0.122675 | 0.024946 | 0.001381 | -0.002365 | -0.022363 | -0.245461 | 0.245461 | 0.006020 | -0.085732 | -0.022258 | 0.267412 | 0.248106 | -0.086720 | -0.099661 | 0.082343 | 0.022529 | 0.047666 | 0.001240 | -0.136158 | 0.019021 | 0.224980 | -0.006952 | 0.246308 | 0.152850 | 0.094785 | 0.241945 | -0.003205 | -0.030900 | -0.198028 | 0.246833 | -0.035024 | 0.035217 | 0.063546 | 0.067970 | -0.080422 | 0.000523 | 0.164583 | 0.043609 | 0.225478 | -0.121277 | -0.129714 | 0.020450 | -0.071578 | -0.041560 | 0.003506 | 0.063546 | -0.080422 | -0.068809 | 0.098099 | 0.114140 | 0.226990 | 0.030672 | -0.132652 | 0.300146 | 0.022673 | 0.004203 | 1.000000 | 0.075303 | -0.085434 | -0.011806 | -0.001169 | 0.120027 |
| Net Income to Stockholder's Equity | 0.274287 | 0.291744 | 0.280617 | 0.075304 | 0.074891 | 0.006216 | 0.011343 | 0.010648 | 0.007693 | 0.011191 | 0.029733 | 0.021490 | 0.057933 | 0.010950 | 0.077920 | 0.148693 | 0.148872 | 0.148906 | 0.222961 | 0.074250 | -0.001104 | 0.183601 | 0.218389 | 0.001952 | -0.007570 | 0.020203 | 0.020273 | 0.006638 | 0.032565 | 0.068054 | 0.019467 | 0.162473 | 0.002489 | -0.002374 | 0.003604 | -0.000700 | -0.123986 | 0.123986 | 0.008679 | -0.806478 | -0.352618 | 0.181150 | 0.215690 | -0.422238 | 0.041242 | -0.006587 | 0.004093 | -0.026296 | -0.013604 | -0.034699 | -0.006549 | 0.096235 | 0.000679 | 0.109766 | 0.048011 | 0.038880 | 0.045075 | 0.021419 | -0.008188 | -0.092725 | 0.069212 | 0.000176 | 0.004009 | 0.017171 | 0.585971 | -0.749621 | 0.007529 | 0.247707 | 0.006213 | -0.190700 | -0.015505 | -0.017511 | 0.002406 | 0.009240 | 0.002661 | -0.000700 | 0.017171 | -0.749621 | -0.615905 | 0.050483 | 0.026664 | 0.107850 | 0.180666 | -0.070354 | 0.328492 | -0.040217 | 0.000127 | 0.075303 | 1.000000 | -0.791836 | -0.000093 | 0.005147 | 0.029622 |
| Liability to Equity | -0.143629 | -0.141039 | -0.142838 | -0.085434 | -0.085407 | 0.001541 | -0.004043 | -0.004390 | -0.011899 | -0.002996 | 0.034809 | -0.035363 | -0.080773 | -0.003423 | -0.030002 | -0.110850 | -0.111797 | -0.111682 | -0.114114 | -0.047298 | -0.002132 | -0.077102 | -0.107727 | 0.001687 | 0.000537 | -0.011685 | -0.011705 | -0.007433 | -0.033052 | -0.068649 | -0.005198 | -0.133686 | -0.003741 | 0.009645 | 0.006926 | -0.010045 | 0.349250 | -0.349250 | 0.001791 | 0.955857 | 0.621808 | -0.075374 | -0.104149 | 0.670373 | 0.084243 | 0.023753 | 0.003221 | -0.008359 | -0.015134 | 0.273553 | 0.022624 | -0.011415 | 0.002286 | -0.177645 | -0.024486 | 0.039449 | -0.097849 | -0.061051 | 0.029935 | 0.286398 | -0.091214 | 0.001788 | 0.007637 | -0.017927 | -0.650474 | 0.963908 | 0.000063 | -0.109810 | -0.007383 | 0.050501 | 0.043730 | 0.068305 | -0.007808 | 0.013463 | -0.003770 | 0.010509 | -0.017927 | 0.963908 | 0.778135 | -0.015893 | -0.019213 | -0.098545 | -0.231107 | 0.132372 | -0.159697 | 0.021982 | -0.003724 | -0.085434 | -0.791836 | 1.000000 | 0.002119 | 0.001487 | -0.159654 |
| Degree of Financial Leverage (DFL) | -0.016575 | -0.011515 | -0.014663 | -0.011806 | -0.011268 | 0.000935 | 0.000855 | 0.000927 | -0.000556 | 0.000774 | 0.013577 | -0.013945 | -0.006348 | -0.007301 | -0.014962 | -0.021860 | -0.021781 | -0.021674 | -0.018829 | -0.006200 | -0.001140 | -0.015936 | -0.017885 | -0.000672 | 0.001247 | 0.002030 | 0.002014 | 0.000014 | 0.005520 | -0.000697 | -0.000310 | 0.003717 | -0.000574 | -0.000083 | 0.016829 | -0.001262 | 0.017982 | -0.017982 | -0.004319 | 0.007260 | -0.001061 | -0.016052 | -0.018358 | -0.006647 | -0.020769 | -0.002111 | -0.001342 | 0.001189 | 0.018565 | -0.009233 | -0.001084 | -0.006094 | -0.001859 | -0.036716 | -0.030419 | -0.031049 | -0.024289 | -0.000876 | 0.001636 | 0.006987 | -0.008961 | 0.002294 | -0.000344 | -0.017911 | -0.011134 | 0.000745 | -0.002160 | -0.013766 | -0.001156 | -0.017607 | -0.003415 | 0.001264 | -0.000342 | 0.000143 | 0.000208 | -0.000825 | -0.017911 | 0.000745 | 0.002936 | -0.002991 | -0.002400 | -0.003771 | -0.001471 | 0.022033 | -0.010463 | -0.001881 | -0.008812 | -0.011806 | -0.000093 | 0.002119 | 1.000000 | 0.016513 | -0.016739 |
| Interest Coverage Ratio (Interest expense to EBIT) | 0.010573 | 0.013372 | 0.011473 | -0.001167 | -0.001158 | 0.000393 | 0.000984 | 0.000957 | 0.001024 | 0.000798 | 0.006232 | -0.012160 | 0.001262 | -0.000779 | 0.030275 | -0.002175 | -0.002358 | -0.002328 | 0.008039 | 0.001358 | -0.000053 | 0.006331 | 0.008143 | -0.000327 | 0.004576 | 0.005373 | 0.005329 | 0.001086 | 0.001723 | -0.000446 | 0.001498 | 0.005525 | -0.000150 | 0.001867 | -0.034321 | -0.000431 | 0.012571 | -0.012571 | -0.000342 | 0.001776 | -0.000620 | 0.006258 | 0.008172 | 0.006764 | 0.019358 | -0.000497 | 0.000892 | -0.011491 | 0.000537 | 0.017437 | -0.002667 | 0.003220 | -0.000957 | -0.008912 | -0.000886 | 0.007281 | -0.007205 | 0.000209 | 0.003422 | 0.021428 | -0.001268 | 0.023608 | 0.005260 | 0.009450 | -0.002063 | 0.005038 | 0.003908 | 0.009603 | 0.000088 | -0.005122 | 0.025886 | -0.009416 | -0.000004 | -0.003773 | -0.000083 | -0.002169 | 0.009450 | 0.005038 | -0.007973 | 0.000176 | 0.001836 | 0.006057 | 0.000239 | 0.007652 | 0.012746 | 0.000239 | 0.001027 | -0.001169 | 0.005147 | 0.001487 | 0.016513 | 1.000000 | -0.008339 |
| Equity to Liability | 0.052416 | 0.057887 | 0.056430 | 0.120029 | 0.120196 | -0.017071 | -0.014559 | -0.010900 | 0.012293 | -0.011299 | -0.120763 | -0.045244 | 0.331710 | 0.028945 | -0.053148 | 0.098434 | 0.098721 | 0.098390 | 0.036722 | 0.052117 | 0.233203 | -0.009316 | 0.028185 | -0.002302 | 0.001725 | 0.001253 | 0.001383 | 0.014498 | -0.015962 | -0.010685 | -0.002454 | 0.045546 | 0.010228 | -0.017084 | -0.012626 | 0.338898 | -0.625879 | 0.625879 | 0.104344 | -0.146012 | -0.018260 | -0.009528 | 0.031292 | -0.205183 | -0.198485 | 0.010629 | 0.002744 | 0.006726 | 0.044538 | -0.226456 | -0.011295 | -0.030261 | 0.000081 | 0.369149 | 0.095978 | -0.015311 | 0.299732 | -0.012449 | -0.038004 | -0.506360 | 0.398769 | -0.005719 | 0.004448 | 0.098930 | 0.058512 | -0.156443 | 0.014437 | 0.042936 | 0.031325 | 0.007907 | -0.101157 | -0.156954 | 0.077424 | -0.145710 | 0.022476 | -0.007758 | 0.098930 | -0.156443 | -0.110885 | 0.034015 | 0.109775 | 0.113629 | 0.003321 | -0.262199 | 0.073916 | 0.014871 | 0.050609 | 0.120027 | 0.029622 | -0.159654 | -0.016739 | -0.008339 | 1.000000 |
# plotting correlation heatmap
plt.figure(figsize = (35,30))
mask = np.triu(np.ones_like(corr))
sns.heatmap(corr, mask=mask,cmap='Blues',square=True, linewidths=.5) #, cbar_kws={"shrink": .5}
plt.show()
# identifying features with greater than 0.70 correlation
correlated_features = []
for i in range(len(corr.columns)):
    for j in range(i):
        if abs(corr.iloc[i, j]) > 0.70:
            colnamei = corr.columns[i]
            colnamej = corr.columns[j]
            correlated_features.append([colnamei, colnamej])
len(correlated_features)
77
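The same pair search can be written without nested loops. The sketch below is only an illustration: `high_corr_pairs` is a hypothetical helper that is not part of the original notebook, and the toy frame is made up. It masks the strictly lower triangle of a correlation matrix and keeps the pairs whose absolute correlation exceeds the threshold.

```python
import numpy as np
import pandas as pd

def high_corr_pairs(corr, threshold=0.70):
    """Return [col_i, col_j] pairs (i > j) with |correlation| > threshold."""
    values = corr.to_numpy()
    # strictly lower triangle: each pair is visited once and the diagonal is skipped
    lower = np.tril(np.ones(values.shape, dtype=bool), k=-1)
    rows, cols = np.where(lower & (np.abs(values) > threshold))
    return [[corr.columns[i], corr.columns[j]] for i, j in zip(rows, cols)]

# toy data (illustrative only): 'b' is almost a copy of 'a', 'c' is unrelated noise
rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.05, size=200),
    "c": rng.normal(size=200),
})
pairs = high_corr_pairs(toy.corr())
print(pairs)
```

Calling `high_corr_pairs(corr, 0.90)` on the notebook's `corr` should give the same kind of list as the loop, assuming `corr` is a square pandas correlation matrix.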
|
Several pairs of independent variables are highly to extremely correlated: 77 pairs have an absolute correlation above 0.70.
|
|
To perform bivariate and multivariate analysis, we do the following:
1. Plot bar plots comparing the medians of bankrupt and healthy firms for all features
2. Identify pairs of features with extremely high correlation
3. Plot a scatterplot of each of those pairs with hue as "Bankrupt?"
|
# plotting barplots to compare medians of Bankrupt and healthy firms for all features
fig, ax = plt.subplots(nrows = 10, ncols= 10, figsize=(100,95))
for variable, subplot in zip(df1.columns[1:], ax.flatten()):
    # x and y are passed by keyword; recent seaborn versions no longer accept them positionally
    z = sns.barplot(x=df1['Bankrupt?'], y=df1[variable], estimator=np.median, ax=subplot, ci = 0, palette="Blues_d")
plt.tight_layout(pad=4.0)
# looking at values in Liability-Assets Flag
df1[' Liability-Assets Flag'].value_counts()
0    6811
1       8
Name: Liability-Assets Flag, dtype: int64
# removing Liability-Assets Flag: it is nearly constant (only 8 of 6819 rows are 1), so it adds little to further analysis
df1 = df1.drop(' Liability-Assets Flag',axis = 1)
# identifying features with extremely high correlation
correlated_features = []
for i in range(len(corr.columns)):
    for j in range(i):
        if abs(corr.iloc[i, j]) > 0.90:
            colnamei = corr.columns[i]
            colnamej = corr.columns[j]
            correlated_features.append([colnamei, colnamej])
# plotting identified features with hue as 'Bankrupt?'
fig, ax = plt.subplots(5,6,figsize=(30,25))
for variable, subplot in zip(correlated_features, ax.flatten()):
    # keyword x/y for recent seaborn; hue is taken from df1 so it matches the plotted frame
    z = sns.scatterplot(x=df1[variable[0]], y=df1[variable[1]], hue=df1['Bankrupt?'], ax=subplot)
plt.tight_layout(pad=4.0)
plt.show()
|
Our bivariate and multivariate analysis shows several relationships; we analyse them in detail in the next section.
|
As most of the data lies between 0 and 1, we can apply min-max scaling to the columns outside this range if needed during modelling. Keep in mind that this does not treat outliers or skewness; it only brings everything to the same scale.
We can apply a log(x+1) transformation to the columns with skewness greater than 1 or less than -1. We suggest log(x+1) because most of the values are very small to begin with, and a direct log might not yield desirable results.
We found that many of the independent variables are correlated with each other, pointing to moderate to high multicollinearity. We identified 77 distinct pairs of independent features with correlation greater than 0.7 or less than -0.7.
We plotted bar charts with Bankrupt? on the x-axis and the median of each independent feature on the y-axis to see whether there were any clear relationships. We found that:
Research and development expense rate, Tax rate (A), Quick Ratio, Cash/Total Assets, Quick Assets/Current Liability and Cash/Current Liability tend to go bankrupt.
Total debt/Total net worth, Inventory Turnover Rate (times), Fixed Assets Turnover Frequency, Allocation rate per person, Current Liability to Assets, Long-term Liability to Current Assets, Quick Asset Turnover Rate, Current Liability to Current Assets and Total assets to GNP price tend to go bankrupt.
We plotted scatterplots for extremely correlated features (corr > 0.90) with hue as bankruptcy to see whether there are any evident clusters. We found that:
Net Value Per Share (A, B & C) tend to go bankrupt.
Persistent EPS in the Last Four Seasons, Per Share Net profit before tax (Yuan ¥) and Net profit before tax/Paid-in capital tend to go bankrupt.
|
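The two remarks on scaling and on log(x+1) can be checked on synthetic data. The sketch below uses a made-up lognormal series, not a column from the dataset, to show that min-max scaling leaves skewness unchanged while `np.log1p` reduces it.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# made-up right-skewed feature, standing in for a skewed ratio column
x = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=5000))

# min-max scaling is a linear map, so the shape of the distribution is unchanged
x_minmax = (x - x.min()) / (x.max() - x.min())

# log(x + 1) compresses the right tail; it requires x > -1
x_log = np.log1p(x)

print("skew raw     :", x.skew())
print("skew min-max :", x_minmax.skew())
print("skew log1p   :", x_log.skew())
```

How much log(x+1) helps on the real columns depends on their actual distributions, so the skewness should be re-checked after transforming.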
Data preparation is the process of cleansing data to get it ready for modelling.
We will follow the steps below to prepare the dataset:
1. Remove redundant columns based on domain knowledge
2. Remove outliers using the IQR method
3. Impute null values created by outlier removal
4. Reduce multicollinearity using the Variance Inflation Factor
5. Transform skewed variables
6. Train-test split
|
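As a rough illustration of step 4, the Variance Inflation Factor of feature j is 1 / (1 - R²_j), where R²_j comes from regressing that feature on all the others with an intercept. These values are the diagonal of the inverse correlation matrix. The sketch below uses that identity on made-up data; `vif_from_corr` is a hypothetical helper, not the notebook's own code.

```python
import numpy as np
import pandas as pd

def vif_from_corr(features):
    """VIF per column, read from the diagonal of the inverse correlation matrix."""
    corr = features.corr().to_numpy()
    return pd.Series(np.diag(np.linalg.inv(corr)), index=features.columns)

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
toy = pd.DataFrame({
    "x1": x1,
    "x2": x2,
    "x1_plus_noise": x1 + rng.normal(scale=0.1, size=500),  # nearly collinear with x1
})
vifs = vif_from_corr(toy)
print(vifs)
```

The independent column should have a VIF close to 1, and the two nearly collinear columns should have large values. Results from `variance_inflation_factor` in statsmodels may differ unless a constant column is included in the design matrix, and the inversion fails if two columns are perfectly collinear.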
# copying df1
df2 = df1.copy()
# eliminating redundant columns based on domain knowledge
remove = [' ROA(A) before interest and % after tax',' ROA(B) before interest and depreciation after tax',
' Operating Gross Margin',' Realized Sales Gross Margin',' Net Value Per Share (A)',
' Net Value Per Share (C)',' Continuous interest rate (after tax)',
' Per Share Net profit before tax (Yuan ¥)',' Operating Profit Per Share (Yuan ¥)',
' Total Asset Turnover',' Debt ratio %', ' Current Liability to Liability',' Current Liability to Equity']
df2 = df2.drop(remove,axis=1)
df2.head()
| | Bankrupt? | ROA(C) before interest and depreciation before interest | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.370594 | 0.998969 | 0.796887 | 0.808809 | 0.302646 | 0.000126 | 0.000000 | 0.458143 | 0.000725 | 0.000000 | 0.147950 | 0.169141 | 0.311664 | 0.017560 | 0.022102 | 0.848195 | 0.688979 | 0.688979 | 0.217535 | 4980000000.000000 | 0.000327 | 0.263100 | 0.363725 | 0.002259 | 0.001208 | 0.629951 | 0.021266 | 0.792424 | 0.005024 | 0.390284 | 0.006479 | 0.095885 | 0.137757 | 0.398036 | 0.001814 | 0.003487 | 0.000182 | 0.000117 | 0.032903 | 0.034164 | 0.392913 | 0.037135 | 0.672775 | 0.166673 | 0.190643 | 0.004094 | 0.001997 | 0.000147 | 0.147308 | 0.334015 | 0.276920 | 0.001036 | 0.676269 | 0.721275 | 0.339077 | 0.025592 | 0.903225 | 0.002022 | 0.064856 | 701000000.000000 | 6550000000.000000 | 0.593831 | 458000000.000000 | 0.671568 | 0.424206 | 0.126549 | 0.637555 | 0.458609 | 0.520382 | 0.312905 | 0.118250 | 0.716845 | 0.009219 | 0.622879 | 0.601453 | 0.827890 | 0.290202 | 0.026601 | 0.564050 | 0.016469 |
| 1 | 1 | 0.464291 | 0.998946 | 0.797380 | 0.809301 | 0.303556 | 0.000290 | 0.000000 | 0.461867 | 0.000647 | 0.000000 | 0.182251 | 0.208944 | 0.318137 | 0.021144 | 0.022080 | 0.848088 | 0.689693 | 0.689702 | 0.217620 | 6110000000.000000 | 0.000443 | 0.264516 | 0.376709 | 0.006016 | 0.004039 | 0.635172 | 0.012502 | 0.828824 | 0.005059 | 0.376760 | 0.005835 | 0.093743 | 0.168962 | 0.397725 | 0.001286 | 0.004917 | 9360000000.000000 | 719000000.000000 | 0.025484 | 0.006889 | 0.391590 | 0.012335 | 0.751111 | 0.127236 | 0.182419 | 0.014948 | 0.004136 | 0.001384 | 0.056963 | 0.341106 | 0.289642 | 0.005210 | 0.308589 | 0.731975 | 0.329740 | 0.023947 | 0.931065 | 0.002226 | 0.025516 | 0.000107 | 7700000000.000000 | 0.593916 | 2490000000.000000 | 0.671570 | 0.468828 | 0.120916 | 0.641100 | 0.459001 | 0.567101 | 0.314163 | 0.047775 | 0.795297 | 0.008323 | 0.623652 | 0.610237 | 0.839969 | 0.283846 | 0.264577 | 0.570175 | 0.020794 |
| 2 | 1 | 0.426071 | 0.998857 | 0.796403 | 0.808388 | 0.302035 | 0.000236 | 25500000.000000 | 0.458521 | 0.000790 | 0.000000 | 0.177911 | 0.180581 | 0.307102 | 0.005944 | 0.022760 | 0.848094 | 0.689463 | 0.689470 | 0.217601 | 7280000000.000000 | 0.000396 | 0.264184 | 0.368913 | 0.011543 | 0.005348 | 0.629631 | 0.021248 | 0.792484 | 0.005100 | 0.379093 | 0.006562 | 0.092318 | 0.148036 | 0.406580 | 0.001495 | 0.004227 | 65000000.000000 | 2650000000.000000 | 0.013387 | 0.028997 | 0.381968 | 0.141016 | 0.829502 | 0.340201 | 0.602806 | 0.000991 | 0.006302 | 5340000000.000000 | 0.098162 | 0.336731 | 0.277456 | 0.013879 | 0.446027 | 0.742729 | 0.334777 | 0.003715 | 0.909903 | 0.002060 | 0.021387 | 0.001791 | 0.001023 | 0.594502 | 761000000.000000 | 0.671571 | 0.276179 | 0.117922 | 0.642765 | 0.459254 | 0.538491 | 0.314515 | 0.025346 | 0.774670 | 0.040003 | 0.623841 | 0.601449 | 0.836774 | 0.290189 | 0.026555 | 0.563706 | 0.016474 |
| 3 | 1 | 0.399844 | 0.998700 | 0.796967 | 0.808966 | 0.303350 | 0.000108 | 0.000000 | 0.465705 | 0.000449 | 0.000000 | 0.154187 | 0.193722 | 0.321674 | 0.014368 | 0.022046 | 0.848005 | 0.689110 | 0.689110 | 0.217568 | 4880000000.000000 | 0.000382 | 0.263371 | 0.384077 | 0.004194 | 0.002896 | 0.630228 | 0.009572 | 0.848535 | 0.005047 | 0.379743 | 0.005366 | 0.077727 | 0.147561 | 0.397925 | 0.001966 | 0.003215 | 7130000000.000000 | 9150000000.000000 | 0.028065 | 0.015463 | 0.378497 | 0.021320 | 0.725754 | 0.161575 | 0.225815 | 0.018851 | 0.002961 | 0.001011 | 0.098715 | 0.348716 | 0.276580 | 0.003540 | 0.615848 | 0.729825 | 0.331509 | 0.022165 | 0.906902 | 0.001831 | 0.024161 | 8140000000.000000 | 6050000000.000000 | 0.593889 | 2030000000.000000 | 0.671519 | 0.559144 | 0.120760 | 0.579039 | 0.448518 | 0.604105 | 0.302382 | 0.067250 | 0.739555 | 0.003252 | 0.622929 | 0.583538 | 0.834697 | 0.281721 | 0.026697 | 0.564663 | 0.023982 |
| 4 | 1 | 0.465022 | 0.998973 | 0.797366 | 0.809304 | 0.303475 | 7890000000.000000 | 0.000000 | 0.462746 | 0.000686 | 0.000000 | 0.167502 | 0.212537 | 0.319162 | 0.029690 | 0.022096 | 0.848258 | 0.689697 | 0.689697 | 0.217626 | 5510000000.000000 | 0.000439 | 0.265218 | 0.379690 | 0.006022 | 0.003727 | 0.636055 | 0.005150 | 0.893491 | 0.005303 | 0.375025 | 0.006624 | 0.096927 | 0.167461 | 0.400079 | 0.001449 | 0.004367 | 0.000163 | 0.000294 | 0.040161 | 0.058111 | 0.394371 | 0.023988 | 0.751822 | 0.260330 | 0.358380 | 0.014161 | 0.004275 | 0.000680 | 0.110195 | 0.344639 | 0.287913 | 0.004869 | 0.975007 | 0.732000 | 0.330726 | 0.000000 | 0.913850 | 0.002224 | 0.026385 | 6680000000.000000 | 5050000000.000000 | 0.593915 | 824000000.000000 | 0.671563 | 0.309555 | 0.110933 | 0.622374 | 0.454411 | 0.578469 | 0.311567 | 0.047725 | 0.795016 | 0.003878 | 0.623521 | 0.598782 | 0.839973 | 0.278514 | 0.024752 | 0.575617 | 0.035490 |
# checking the shape
df2.shape
(6819, 81)
|
Let's remove outliers using the IQR method.
|
# eliminating outliers using IQR
for i in df2.drop('Bankrupt?',axis = 1).columns:
    q1 = df2[i].quantile(0.25)
    q3 = df2[i].quantile(0.75)
    iqr = q3 - q1
    upper = q3 + (iqr*1.5)
    lower = q1 - (iqr*1.5)
    # values outside (lower, upper) are filtered out and come back as NaN on assignment
    df2[i] = df2[i][(df2[i]<upper) & (df2[i]>lower)]
# looking at the boxplots again
plt.figure(figsize = (25,20))
ax =sns.boxplot(data = df2, orient="h",palette='Blues_d')
ax.set_title('Outlier Analysis using Boxplots', fontsize = 20)
ax.set(xscale="log")
plt.show()
|
As we can see, the original outliers have been removed (their values are now null). New outliers appear, which is expected because the quartiles are recomputed on the remaining values.
|
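The appearance of new outliers after trimming can be reproduced on made-up data. In the sketch below, `iqr_mask` is a hypothetical helper that mirrors the loop above for a single series. Applying it twice shows that the second pass flags additional points, because the fences are recomputed on the trimmed data.

```python
import numpy as np
import pandas as pd

def iqr_mask(series, k=1.5):
    """Replace values outside (Q1 - k*IQR, Q3 + k*IQR) with NaN."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    # Series.where keeps values where the condition holds and sets the rest to NaN
    return series.where((series > q1 - k * iqr) & (series < q3 + k * iqr))

rng = np.random.default_rng(7)
x = pd.Series(rng.lognormal(size=2000))  # made-up heavy-tailed feature

once = iqr_mask(x)      # first pass: flagged values become NaN
twice = iqr_mask(once)  # second pass: quartiles recomputed on the remaining values

print("NaN after first pass :", once.isna().sum())
print("NaN after second pass:", twice.isna().sum())
```

This is why a single IQR pass does not guarantee a boxplot without outliers; whether repeated passes are desirable is a modelling choice, since each pass discards more data.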
|
Let's treat the null values resulting from the removal of outliers.
We will follow the steps below to treat null values:
1. Find features with a high share of null values and see if they can be dropped
2. Transform the skewed columns using a square root transformation
3. Scale the data to get it ready for KNN imputation
4. Impute null values using KNNImputer
5. Plot the medians to see if there's any deviation
|
# finding features with high null values
nulls = df2.isnull().sum()/len(df2) * 100
nulls[nulls>20]
Total Asset Growth Rate                                20.252236
Fixed Assets Turnover Frequency                        20.794838
Current Asset Turnover Rate                            20.516205
Degree of Financial Leverage (DFL)                     22.041355
Interest Coverage Ratio (Interest expense to EBIT)     20.838833
dtype: float64
#copying the dataset
df3 = df2.copy()
# transforming skewed columns using square root transformation
for i in df3.columns[1:]:
    if abs(df3[i].skew()) > 1:
        print(i,'skewness before transformation =',df3[i].skew())
        df3[i] = np.sqrt(df3[i])
        print(i,'skewness after transformation =',df3[i].skew())
        print()
Operating Expense Rate skewness before transformation = 1.2486733165601693
Operating Expense Rate skewness after transformation = 0.9704917487673127

Research and development expense rate skewness before transformation = 1.2223584686011266
Research and development expense rate skewness after transformation = 0.5230365802827721

Long-term fund suitability ratio (A) skewness before transformation = 1.516058058822089
Long-term fund suitability ratio (A) skewness after transformation = 1.3681217333454307

Borrowing dependency skewness before transformation = 1.0178093388205238
Borrowing dependency skewness after transformation = 1.0083829140276173

Contingent liabilities/Net worth skewness before transformation = 2.064014083541369
Contingent liabilities/Net worth skewness after transformation = 2.035971865396791

Inventory Turnover Rate (times) skewness before transformation = 1.1372886202283368
Inventory Turnover Rate (times) skewness after transformation = 0.8098361672667492

Fixed Assets Turnover Frequency skewness before transformation = 3.020299835272531
Fixed Assets Turnover Frequency skewness after transformation = 1.7504193115139708

Revenue per person skewness before transformation = 1.2875627466044757
Revenue per person skewness after transformation = 0.5794751683400569

Allocation rate per person skewness before transformation = 1.1603820080135723
Allocation rate per person skewness after transformation = 0.25256192967534313

Cash/Total Assets skewness before transformation = 1.1814214094550086
Cash/Total Assets skewness after transformation = 0.4223772666031243

Cash/Current Liability skewness before transformation = 1.4427560380681825
Cash/Current Liability skewness after transformation = 0.6420693938643052

Current Liabilities/Equity skewness before transformation = 1.0079451962626926
Current Liabilities/Equity skewness after transformation = 0.9992533129091725

Long-term Liability to Current Assets skewness before transformation = 1.467649868590847
Long-term Liability to Current Assets skewness after transformation = 0.5862615950705373

Current Asset Turnover Rate skewness before transformation = 2.5805916938465274
Current Asset Turnover Rate skewness after transformation = 1.757822993502295

Quick Asset Turnover Rate skewness before transformation = 1.1373534593747237
Quick Asset Turnover Rate skewness after transformation = 0.8859087564507104

Equity to Long-term Liability skewness before transformation = 1.2837777636267995
Equity to Long-term Liability skewness after transformation = 1.258798149619653

Total assets to GNP price skewness before transformation = 1.4971928329841655
Total assets to GNP price skewness after transformation = 0.7488878333733503

Equity to Liability skewness before transformation = 1.1335923560672716
Equity to Liability skewness after transformation = 0.6885894447832723
# performing min-max normalisation
from sklearn.preprocessing import MinMaxScaler
minmax = MinMaxScaler()
df3.iloc[:,1:] = minmax.fit_transform(df3.iloc[:,1:])
df3.head()
| | Bankrupt? | ROA(C) before interest and depreciation before interest | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | nan | 0.374509 | nan | nan | nan | 0.000000 | 0.000000 | 0.283229 | 0.706628 | 0.000000 | 0.124745 | nan | 0.235151 | 0.190110 | 0.481353 | 0.757374 | 0.182271 | 0.181879 | 0.111907 | 0.422145 | nan | 0.113307 | 0.145023 | 0.077072 | 0.051381 | 0.052024 | nan | 0.203203 | 0.207740 | nan | nan | 0.372072 | 0.019208 | 0.306109 | 0.707278 | 0.220471 | 0.000000 | 0.115211 | 0.441558 | 0.680506 | 0.388253 | nan | 0.040934 | 0.169899 | 0.190643 | 0.107858 | 0.081456 | 0.071253 | 0.673487 | 0.285579 | 0.302838 | 0.044862 | 0.616485 | nan | nan | nan | nan | 0.156989 | 0.954206 | nan | 0.809321 | nan | 0.045800 | 0.402994 | 0.532432 | nan | 0.410855 | 0.427092 | 0.181028 | 0.369944 | nan | nan | 0.883582 | 0.019007 | 0.390126 | nan | nan | nan | nan | 0.415655 |
| 1 | 1 | 0.323408 | 0.328100 | 0.367647 | 0.358432 | 0.565394 | 0.000000 | 0.000000 | 0.382014 | 0.630604 | 0.000000 | 0.456584 | 0.315015 | 0.384052 | 0.228918 | 0.418942 | 0.563321 | 0.655577 | 0.661253 | 0.614004 | 0.552480 | 0.381382 | 0.675826 | 0.416019 | 0.205263 | 0.171846 | nan | 0.670153 | 0.342927 | 0.215339 | 0.453845 | 0.483273 | 0.345047 | 0.370899 | 0.293577 | 0.501582 | 0.310840 | 0.967955 | nan | 0.341991 | 0.305572 | 0.353012 | 0.627259 | 0.298758 | 0.129699 | 0.182419 | 0.206085 | 0.168726 | 0.218375 | 0.260431 | 0.374931 | nan | 0.225597 | 0.180904 | 0.291798 | 0.288100 | nan | 0.373919 | 0.358384 | 0.375409 | 0.341744 | 0.877496 | 0.305566 | 0.249000 | 0.429871 | 0.588439 | 0.654864 | 0.440620 | 0.440867 | 0.380992 | 0.436871 | 0.693546 | 0.363867 | 0.839542 | 0.382399 | 0.554549 | 0.358727 | 0.676055 | nan | nan | 0.467063 |
| 2 | 1 | 0.161290 | 0.151004 | nan | nan | nan | 0.000000 | 0.054421 | 0.293241 | 0.769981 | 0.000000 | 0.414594 | 0.020608 | 0.130187 | 0.064352 | nan | 0.574425 | 0.502843 | 0.507239 | 0.504907 | 0.687428 | 0.180587 | 0.543995 | 0.253305 | 0.393813 | 0.227501 | nan | nan | 0.203435 | 0.224359 | 0.601556 | nan | 0.327065 | 0.135050 | 0.650624 | 0.583070 | 0.267221 | 0.080663 | nan | 0.179654 | 0.626935 | 0.096687 | nan | 0.556764 | 0.346787 | 0.602806 | 0.053062 | 0.257104 | nan | 0.448792 | 0.319807 | 0.642068 | 0.600999 | 0.343724 | 0.838049 | 0.692423 | 0.406218 | nan | 0.194956 | 0.314668 | nan | 0.000000 | nan | 0.076100 | 0.446071 | 0.346640 | 0.461411 | 0.454597 | 0.449760 | 0.258535 | 0.455611 | 0.367951 | 0.189685 | nan | 0.471413 | 0.390052 | 0.002340 | nan | nan | nan | 0.415723 |
| 3 | 1 | 0.050041 | nan | nan | nan | 0.129771 | 0.000000 | 0.000000 | 0.483825 | 0.437622 | 0.000000 | 0.185079 | 0.157017 | 0.465419 | 0.155559 | 0.322671 | 0.414024 | 0.268563 | 0.268213 | 0.307487 | 0.410611 | 0.120284 | 0.221056 | 0.569790 | 0.143094 | 0.123225 | 0.187354 | 0.513100 | 0.418591 | 0.212702 | 0.642616 | 0.000000 | 0.142931 | 0.129696 | 0.301626 | 0.766614 | 0.203250 | 0.844816 | nan | 0.376623 | 0.457826 | 0.004217 | 0.824653 | 0.215302 | 0.164702 | 0.225815 | 0.231431 | 0.120801 | 0.186616 | 0.451319 | 0.470832 | 0.087495 | 0.153301 | 0.544906 | 0.182563 | 0.430443 | 0.992222 | nan | nan | 0.355476 | nan | 0.777817 | 0.206918 | 0.203000 | nan | 0.701797 | 0.644866 | nan | 0.072386 | 0.539373 | nan | 0.976251 | nan | 0.524810 | 0.042401 | 0.054744 | nan | 0.517624 | 0.179257 | 0.156417 | 0.501590 |
| 4 | 1 | 0.326510 | 0.382360 | 0.349354 | 0.363203 | 0.393893 | 0.888701 | 0.000000 | 0.405331 | 0.668616 | 0.000000 | 0.313901 | 0.352306 | 0.407648 | 0.321434 | 0.463374 | 0.871888 | 0.658219 | 0.658055 | 0.650034 | 0.483276 | 0.363754 | 0.954924 | 0.478227 | 0.205476 | 0.158576 | nan | 0.276029 | 0.591157 | 0.268567 | 0.343721 | nan | 0.385224 | 0.353986 | 0.388484 | 0.564873 | 0.276074 | 0.000000 | 0.182873 | 0.538961 | 0.887518 | 0.427108 | 0.874735 | 0.301100 | 0.265369 | 0.358380 | 0.200588 | 0.174385 | 0.153127 | 0.503806 | 0.419448 | nan | 0.210826 | 0.970391 | 0.293036 | 0.367507 | 0.000000 | 0.059788 | 0.356727 | 0.388200 | nan | 0.710634 | 0.304508 | 0.082400 | 0.348962 | 0.388531 | 0.000000 | 0.283373 | 0.279525 | 0.429647 | 0.298782 | 0.692821 | 0.361492 | 0.573026 | 0.320777 | 0.340110 | 0.359115 | 0.278471 | nan | nan | 0.610179 |
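The imputation step that follows relies on `KNNImputer`, which fills each missing entry with the mean of that feature over the nearest rows. Distances are computed on the features both rows have observed, which is why the data was scaled first. The sketch below is only a toy illustration on a made-up matrix; `n_neighbors=2` is chosen for the example and is not a recommendation.

```python
import numpy as np
from sklearn.impute import KNNImputer

# toy matrix (illustrative only) with two missing entries
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)  # observed values are kept, NaNs are replaced
print(X_filled)
```

The missing value in the first row becomes the average of the third-column values of its two nearest rows. Observed entries are returned unchanged.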
# using optimal n_neighbors to impute null values
impute = KNNImputer(n_neighbors=5)
df3_treated = df3.copy()
df3_treated[:] = impute.fit_transform(df3)
df3_treated.head()
| | Bankrupt? | ROA(C) before interest and depreciation before interest | Operating Profit Rate | Pre-tax net Interest Rate | After-tax net Interest Rate | Non-industry income and expenditure/revenue | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Persistent EPS in the Last Four Seasons | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Regular Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Current Ratio | Quick Ratio | Interest Expense Ratio | Total debt/Total net worth | Net worth/Assets | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Operating profit/Paid-in capital | Net profit before tax/Paid-in capital | Inventory and accounts receivable/Net value | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Net Worth Turnover Rate (times) | Revenue per person | Operating profit per person | Allocation rate per person | Working Capital to Total Assets | Quick Assets/Total Assets | Current Assets/Total Assets | Cash/Total Assets | Quick Assets/Current Liability | Cash/Current Liability | Current Liability to Assets | Operating Funds to Liability | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Current Liabilities/Equity | Long-term Liability to Current Assets | Retained Earnings to Total Assets | Total income/Total expense | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Working capitcal Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Equity to Long-term Liability | Cash Flow to Total Assets | Cash Flow to Liability | CFO to Assets | Cash Flow to Equity | Current Liability to Current Assets | Net Income to Total Assets | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Net Income to Stockholder's Equity | Liability to Equity | Degree of Financial Leverage (DFL) | Interest Coverage Ratio (Interest expense to EBIT) | Equity to Liability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000000 | 0.203722 | 0.374509 | 0.151650 | 0.127733 | 0.343308 | 0.000000 | 0.000000 | 0.283229 | 0.706628 | 0.000000 | 0.124745 | 0.136408 | 0.235151 | 0.190110 | 0.481353 | 0.757374 | 0.182271 | 0.181879 | 0.111907 | 0.422145 | 0.168182 | 0.113307 | 0.145023 | 0.077072 | 0.051381 | 0.052024 | 0.680018 | 0.203203 | 0.207740 | 0.697629 | 0.148703 | 0.372072 | 0.019208 | 0.306109 | 0.707278 | 0.220471 | 0.000000 | 0.115211 | 0.441558 | 0.680506 | 0.388253 | 0.761951 | 0.040934 | 0.169899 | 0.190643 | 0.107858 | 0.081456 | 0.071253 | 0.673487 | 0.285579 | 0.302838 | 0.044862 | 0.616485 | 0.102913 | 0.671780 | 0.480556 | 0.138952 | 0.156989 | 0.954206 | 0.373081 | 0.809321 | 0.117400 | 0.045800 | 0.402994 | 0.532432 | 0.360066 | 0.410855 | 0.427092 | 0.181028 | 0.369944 | 0.898775 | 0.157910 | 0.883582 | 0.019007 | 0.390126 | 0.128426 | 0.686027 | 0.155278 | 0.139947 | 0.415655 |
| 1 | 1.000000 | 0.323408 | 0.328100 | 0.367647 | 0.358432 | 0.565394 | 0.000000 | 0.000000 | 0.382014 | 0.630604 | 0.000000 | 0.456584 | 0.315015 | 0.384052 | 0.228918 | 0.418942 | 0.563321 | 0.655577 | 0.661253 | 0.614004 | 0.552480 | 0.381382 | 0.675826 | 0.416019 | 0.205263 | 0.171846 | 0.586350 | 0.670153 | 0.342927 | 0.215339 | 0.453845 | 0.483273 | 0.345047 | 0.370899 | 0.293577 | 0.501582 | 0.310840 | 0.967955 | 0.119762 | 0.341991 | 0.305572 | 0.353012 | 0.627259 | 0.298758 | 0.129699 | 0.182419 | 0.206085 | 0.168726 | 0.218375 | 0.260431 | 0.374931 | 0.126673 | 0.225597 | 0.180904 | 0.291798 | 0.288100 | 0.666008 | 0.373919 | 0.358384 | 0.375409 | 0.341744 | 0.877496 | 0.305566 | 0.249000 | 0.429871 | 0.588439 | 0.654864 | 0.440620 | 0.440867 | 0.380992 | 0.436871 | 0.693546 | 0.363867 | 0.839542 | 0.382399 | 0.554549 | 0.358727 | 0.676055 | 0.505796 | 0.468626 | 0.467063 |
| 2 | 1.000000 | 0.161290 | 0.151004 | 0.089096 | 0.243352 | 0.152061 | 0.000000 | 0.054421 | 0.293241 | 0.769981 | 0.000000 | 0.414594 | 0.020608 | 0.130187 | 0.064352 | 0.317425 | 0.574425 | 0.502843 | 0.507239 | 0.504907 | 0.687428 | 0.180587 | 0.543995 | 0.253305 | 0.393813 | 0.227501 | 0.117832 | 0.706628 | 0.203435 | 0.224359 | 0.601556 | 0.479164 | 0.327065 | 0.135050 | 0.650624 | 0.583070 | 0.267221 | 0.080663 | 0.180671 | 0.179654 | 0.626935 | 0.096687 | 0.609916 | 0.556764 | 0.346787 | 0.602806 | 0.053062 | 0.257104 | 0.137723 | 0.448792 | 0.319807 | 0.642068 | 0.600999 | 0.343724 | 0.838049 | 0.692423 | 0.406218 | 0.240773 | 0.194956 | 0.314668 | 0.517157 | 0.000000 | 0.358817 | 0.076100 | 0.446071 | 0.346640 | 0.461411 | 0.454597 | 0.449760 | 0.258535 | 0.455611 | 0.367951 | 0.189685 | 0.559461 | 0.471413 | 0.390052 | 0.002340 | 0.712867 | 0.213346 | 0.179485 | 0.415723 |
| 3 | 1.000000 | 0.050041 | 0.153914 | 0.108429 | 0.143912 | 0.129771 | 0.000000 | 0.000000 | 0.483825 | 0.437622 | 0.000000 | 0.185079 | 0.157017 | 0.465419 | 0.155559 | 0.322671 | 0.414024 | 0.268563 | 0.268213 | 0.307487 | 0.410611 | 0.120284 | 0.221056 | 0.569790 | 0.143094 | 0.123225 | 0.187354 | 0.513100 | 0.418591 | 0.212702 | 0.642616 | 0.000000 | 0.142931 | 0.129696 | 0.301626 | 0.766614 | 0.203250 | 0.844816 | 0.124754 | 0.376623 | 0.457826 | 0.004217 | 0.824653 | 0.215302 | 0.164702 | 0.225815 | 0.231431 | 0.120801 | 0.186616 | 0.451319 | 0.470832 | 0.087495 | 0.153301 | 0.544906 | 0.182563 | 0.430443 | 0.992222 | 0.240146 | 0.162740 | 0.355476 | 0.426777 | 0.777817 | 0.206918 | 0.203000 | 0.391394 | 0.701797 | 0.644866 | 0.411397 | 0.072386 | 0.539373 | 0.450311 | 0.976251 | 0.213852 | 0.524810 | 0.042401 | 0.054744 | 0.114096 | 0.517624 | 0.179257 | 0.156417 | 0.501590 |
| 4 | 1.000000 | 0.326510 | 0.382360 | 0.349354 | 0.363203 | 0.393893 | 0.888701 | 0.000000 | 0.405331 | 0.668616 | 0.000000 | 0.313901 | 0.352306 | 0.407648 | 0.321434 | 0.463374 | 0.871888 | 0.658219 | 0.658055 | 0.650034 | 0.483276 | 0.363754 | 0.954924 | 0.478227 | 0.205476 | 0.158576 | 0.533523 | 0.276029 | 0.591157 | 0.268567 | 0.343721 | 0.081112 | 0.385224 | 0.353986 | 0.388484 | 0.564873 | 0.276074 | 0.000000 | 0.182873 | 0.538961 | 0.887518 | 0.427108 | 0.874735 | 0.301100 | 0.265369 | 0.358380 | 0.200588 | 0.174385 | 0.153127 | 0.503806 | 0.419448 | 0.233590 | 0.210826 | 0.970391 | 0.293036 | 0.367507 | 0.000000 | 0.059788 | 0.356727 | 0.388200 | 0.377554 | 0.710634 | 0.304508 | 0.082400 | 0.348962 | 0.388531 | 0.000000 | 0.283373 | 0.279525 | 0.429647 | 0.298782 | 0.692821 | 0.361492 | 0.573026 | 0.320777 | 0.340110 | 0.359115 | 0.278471 | 0.239482 | 0.719623 | 0.610179 |
# plotting medians to see deviations
# plotting untreated data (newer seaborn releases accept x and y only as keyword arguments)
sns.lineplot(x=list(range(81)), y=df3.median().values, color='#10c4e8', label='Untreated')
# plotting treated data
sns.lineplot(x=list(range(81)), y=df3_treated.median().values, color='#053463', label='Treated')
# adding a title
plt.title('Comparing Medians before and after treatment',fontsize = 20)
# adding y-axis label
plt.ylabel('Medians', fontsize = 15)
# adding x-axis label
plt.xlabel('Features', fontsize = 15)
plt.legend()
plt.show()
|
Null value treatment summary and inferences:
1. Four features had more than 20% null values, which was not high enough to justify dropping them.
2. We imputed the NaN values using KNN imputation with n_neighbors set to 5.
3. Plotting the medians shows that the treated dataset roughly overlaps the untreated one, which suggests the imputation preserved the distribution of the original data. |
|
|
Read more about KNNImputer and why it is gaining traction in the data science community:
KNNImputer: A robust way to impute missing values (using Scikit-Learn) |
|
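As a rough illustration of what the imputation step does, here is a minimal sketch on made-up data. The frame `toy` and its column names are hypothetical stand-ins for df3, not the notebook's data; only the use of KNNImputer with n_neighbors=5 is taken from the summary above.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical toy frame standing in for df3; the values are random.
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.random((200, 4)), columns=['a', 'b', 'c', 'd'])

# Knock out roughly 20% of one column to mimic a sparsely populated feature.
mask = rng.random(200) < 0.2
toy.loc[mask, 'a'] = np.nan

# Each missing value is replaced by the mean of that feature over the
# 5 nearest rows, with distance measured on the features both rows share.
imputer = KNNImputer(n_neighbors=5)
toy_treated = pd.DataFrame(imputer.fit_transform(toy), columns=toy.columns)

# Compare medians before and after treatment, as the plot above does.
median_shift = (toy.median() - toy_treated.median()).abs()
print(median_shift)
```

Columns without missing values come through unchanged, so their medians do not move at all; only the imputed column can shift, and a small shift is what the overlapping lines in the plot indicate.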
Let's find the features exhibiting high multicollinearity using the Variance Inflation Factor (VIF) and remove them.
|
# preparing data for VIF filtering
X = df3_treated.drop('Bankrupt?', axis=1)        # independent variables
X.insert(loc=0, column='intercept', value=1.0)   # adding a constant column
# automating the VIF filtering process:
# repeatedly drop the feature with the highest VIF until every VIF is <= 5
cols = list(X.columns)
while len(cols) > 0:
    num = X[cols]
    vif_df = pd.DataFrame({'VIF': [vif(num.values, i) for i in range(num.shape[1])]},
                          index=pd.Index(num.columns, name='Features'))
    # the first row is the intercept, so it is skipped when looking for the worst feature
    feature_max = vif_df['VIF'].iloc[1:].idxmax()
    max_vif = vif_df['VIF'].iloc[1:].max()
    if max_vif > 5:
        cols.remove(feature_max)
        print('Feature with high vif:', feature_max, ', VIF:', max_vif)
    else:
        break
Feature with high vif: Liability to Equity , VIF: 572.120896078477
Feature with high vif: Regular Net Profit Growth Rate , VIF: 85.50442245943357
Feature with high vif: Current Assets/Total Assets , VIF: 73.49179139791987
Feature with high vif: Quick Assets/Current Liability , VIF: 47.00193212926744
Feature with high vif: Net worth/Assets , VIF: 36.282349677694874
Feature with high vif: Working Capital to Total Assets , VIF: 29.341823809834885
Feature with high vif: Pre-tax net Interest Rate , VIF: 25.78202182071459
Feature with high vif: Net profit before tax/Paid-in capital , VIF: 24.646853347608154
Feature with high vif: Current Liability to Assets , VIF: 23.965734134599696
Feature with high vif: Net Income to Total Assets , VIF: 23.6283388270775
Feature with high vif: Current Liabilities/Equity , VIF: 18.293768490145236
Feature with high vif: Current Ratio , VIF: 15.167108116199229
Feature with high vif: Persistent EPS in the Last Four Seasons , VIF: 14.087195082276004
Feature with high vif: Cash/Current Liability , VIF: 13.467357787367034
Feature with high vif: Current Liability to Current Assets , VIF: 13.018560212928804
Feature with high vif: Net Income to Stockholder's Equity , VIF: 12.110878239672658
Feature with high vif: CFO to Assets , VIF: 11.369219904601827
Feature with high vif: Interest Coverage Ratio (Interest expense to EBIT) , VIF: 10.862309681780967
Feature with high vif: Total debt/Total net worth , VIF: 10.688804768309849
Feature with high vif: Operating Funds to Liability , VIF: 10.155056632552416
Feature with high vif: Net Worth Turnover Rate (times) , VIF: 9.926398040710609
Feature with high vif: Quick Assets/Total Assets , VIF: 9.081894949518146
Feature with high vif: Cash Flow to Total Assets , VIF: 8.97116729215185
Feature with high vif: After-tax net Interest Rate , VIF: 8.545054513906702
Feature with high vif: Working capitcal Turnover Rate , VIF: 7.440410790255586
Feature with high vif: Operating profit/Paid-in capital , VIF: 7.269877023217225
Feature with high vif: Equity to Liability , VIF: 6.80183180092279
Feature with high vif: Long-term Liability to Current Assets , VIF: 6.653622642744055
Feature with high vif: ROA(C) before interest and depreciation before interest , VIF: 6.387609506373427
Feature with high vif: Operating Profit Rate , VIF: 5.841002170414024
Feature with high vif: Inventory and accounts receivable/Net value , VIF: 5.7705618761157815
Feature with high vif: Interest Expense Ratio , VIF: 5.370022464396947
Feature with high vif: Total income/Total expense , VIF: 5.265170696033466
# removing features with high VIF
cols = ['Bankrupt?']
cols.extend(vif_df.index[1:])
df4 = df3_treated.loc[:,cols]
df4.shape
(6819, 48)
|
Columns with VIF > 5 have been removed. There are 48 columns remaining: 47 predictors plus the target.
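For readers who want to see what the VIF measures, here is a minimal sketch that computes it by hand as 1 / (1 - R²), where R² comes from regressing one feature on all the others. The function name `vif_by_hand` and the toy columns are hypothetical; the notebook itself uses statsmodels' variance_inflation_factor.

```python
import numpy as np

def vif_by_hand(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing column j on the rest."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    # Add an intercept so R^2 is measured around the mean of y.
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

# Hypothetical data: x2 is nearly a copy of x0, while x1 is independent.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
x1 = rng.normal(size=500)
x2 = x0 + 0.1 * rng.normal(size=500)
X_toy = np.column_stack([x0, x1, x2])

vifs = [vif_by_hand(X_toy, j) for j in range(X_toy.shape[1])]
print(vifs)
```

The two near-duplicate columns get a large VIF because each is almost perfectly predicted by the other, while the independent column stays close to 1. This is also why the loop above adds an intercept column before calling the statsmodels function.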
|
|
Let's perform the train-test split using the following steps:
1. Specify the target and predictor variables.
2. Call train_test_split from sklearn and perform the split.
3. Print the shapes of all the splits. |
# specifying target and predictor variables
X = df4.drop('Bankrupt?',axis = 1)
y = df4['Bankrupt?']
Xc = sm.add_constant(X)
# performing the train-test split
X_train, X_test, y_train, y_test = train_test_split(Xc, y, test_size=0.3, random_state=42)
# printing shapes of the splits
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)
(4773, 48)
(4773,)
(2046, 48)
(2046,)
|
Our data analysis and preparation is complete. We now move on to the next stage, where we build predictive models with several algorithms and select the best one for deployment. Our aim is to maximise recall, because letting an unhealthy company slip by is more costly than classifying some healthy companies as unhealthy.
We will explore and optimise the following models:
1. Logistic Regression (maximum likelihood estimation)
2. Decision Tree
3. Random Forest
4. Boosting
5. K-Nearest Neighbors
6. Naïve Bayes |
|
We will follow these steps for Logistic Regression:
1. Build a full Logistic Regression model using maximum likelihood estimation.
2. Plot the Receiver Operating Characteristic (ROC) curve and compute the Area Under the Curve (AUC).
3. Build a scorecard to see how metrics such as precision, recall, F1-score and Cohen's kappa behave at different thresholds.
4. Perform recursive feature elimination (RFE).
5. Build a model with the features that survived RFE.
6. Repeat steps 2 and 3 for the RFE model.
7. Compare both models. |
# building full model
model = sm.Logit(y_train,X_train).fit()
Optimization terminated successfully.
Current function value: 0.071186
Iterations 11
# comparing train and test ROC scores
y_train_prob = model.predict(X_train)
print('Train ROC-AUC:', roc_auc_score(y_train,y_train_prob))
y_test_prob = model.predict(X_test)
print('Test ROC-AUC:', roc_auc_score(y_test,y_test_prob))
Train ROC-AUC: 0.949939325002053 Test ROC-AUC: 0.9555972482801751
# getting predictions for test data
y_prob = model.predict(X_test)
def plot_roc(y_test,y_prob):
    # getting fpr,tpr for ROC curve
    fpr, tpr, thresholds = roc_curve(y_test, y_prob)
    # plotting the ROC curve
    plt.plot(fpr, tpr)
    # setting limits for x and y axes
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.0])
    # plot the straight line showing worst prediction for the model(Junk model)
    plt.plot([0, 1], [0, 1],'r--')
    # add plot and axes labels
    # set text size using 'fontsize'
    plt.title('ROC curve', fontsize = 20)
    plt.xlabel('False positive rate', fontsize = 15)
    plt.ylabel('True positive rate', fontsize = 15)
    # add the AUC score to the plot
    # 'x' and 'y' gives position of the text
    # 's' is the text
    # use round() to round-off the AUC score upto 4 digits
    plt.text(x = 0.22, y = 0.65, s = 'AUC Score: '+ str(round(metrics.roc_auc_score(y_test, y_prob),4)))
    # plot the grid
    plt.grid(True)
plot_roc(y_test,y_prob)
# building threshold scorecard
thresh_score = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score(model,thresh):
    global thresh_score
    y_prob = model.predict(X_test)
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    row = pd.DataFrame([{'Threshold': thresh,
                         'Precision': metrics.precision_score(y_test, y_pred),
                         'Recall': metrics.recall_score(y_test, y_pred),
                         'F1-score': metrics.f1_score(y_test, y_pred),
                         'Kappa': metrics.cohen_kappa_score(y_test, y_pred)}])
    # DataFrame.append was removed in pandas 2.0, so concatenate the new row instead
    thresh_score = pd.concat([thresh_score, row], ignore_index = True)
# calculating metrics for different thresholds
values = np.linspace(0.1,0.9,9)
for i in values:
    cal_score(model,i)
thresh_score
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.335294 | 0.730769 | 0.459677 | 0.429880 |
| 1 | 0.200000 | 0.423077 | 0.564103 | 0.483516 | 0.459989 |
| 2 | 0.300000 | 0.473684 | 0.461538 | 0.467532 | 0.446713 |
| 3 | 0.400000 | 0.566038 | 0.384615 | 0.458015 | 0.440764 |
| 4 | 0.500000 | 0.771429 | 0.346154 | 0.477876 | 0.465247 |
| 5 | 0.600000 | 0.750000 | 0.230769 | 0.352941 | 0.341121 |
| 6 | 0.700000 | 0.714286 | 0.128205 | 0.217391 | 0.208204 |
| 7 | 0.800000 | 0.692308 | 0.115385 | 0.197802 | 0.188968 |
| 8 | 0.900000 | 0.833333 | 0.064103 | 0.119048 | 0.114224 |
# creating function for Youden's index table
def youdens_table(tpr,fpr,thresholds):
    table = pd.DataFrame({'TPR': tpr,
                          'FPR': fpr,
                          'Threshold': thresholds})
    table['Difference'] = table.TPR - table.FPR
    table = table.sort_values('Difference', ascending = False).reset_index(drop = True)
    return table
# calling youdens_table
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
youdens_table(tpr,fpr,thresholds).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.948718 | 0.157012 | 0.020431 | 0.791706 |
| 1 | 0.935897 | 0.148374 | 0.022779 | 0.787523 |
| 2 | 0.961538 | 0.177846 | 0.016046 | 0.783693 |
| 3 | 0.987179 | 0.205285 | 0.011671 | 0.781895 |
| 4 | 0.974359 | 0.193089 | 0.013839 | 0.781270 |
# adding thresh where tpr-fpr is maximum to the score board
cal_score(model,0.020431)
thresh_score
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.335294 | 0.730769 | 0.459677 | 0.429880 |
| 1 | 0.200000 | 0.423077 | 0.564103 | 0.483516 | 0.459989 |
| 2 | 0.300000 | 0.473684 | 0.461538 | 0.467532 | 0.446713 |
| 3 | 0.400000 | 0.566038 | 0.384615 | 0.458015 | 0.440764 |
| 4 | 0.500000 | 0.771429 | 0.346154 | 0.477876 | 0.465247 |
| 5 | 0.600000 | 0.750000 | 0.230769 | 0.352941 | 0.341121 |
| 6 | 0.700000 | 0.714286 | 0.128205 | 0.217391 | 0.208204 |
| 7 | 0.800000 | 0.692308 | 0.115385 | 0.197802 | 0.188968 |
| 8 | 0.900000 | 0.833333 | 0.064103 | 0.119048 | 0.114224 |
| 9 | 0.020431 | 0.193211 | 0.948718 | 0.321041 | 0.275123 |
|
1. Our full Logistic Regression (MLE) model gave a test ROC-AUC of 0.9556.
2. The highest precision on the scorecard was 0.833333, at threshold 0.9.
3. The highest recall on the scorecard was 0.948718, at threshold 0.020431, obtained through Youden's index.
4. The highest F1-score was 0.483516, at threshold 0.2.
5. The highest kappa was 0.465247, at threshold 0.5. |
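Youden's index is simply J = TPR - FPR, and the threshold we pick is the one where J is largest. A minimal sketch on synthetic scores follows; `y_true` and `y_score` are made up for illustration and are not the notebook's predictions.

```python
import numpy as np
from sklearn.metrics import roc_curve, recall_score

# Hypothetical labels and scores: about 5% positives, which score higher on average.
rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.05).astype(int)
y_score = np.clip(0.05 + 0.3 * y_true + 0.1 * rng.normal(size=1000), 0, 1)

fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr                      # Youden's J at each candidate threshold
best = np.argmax(j)
best_thresh = thresholds[best]

# Same rule as the scorecard: predict 1 when the score is >= the threshold.
y_pred = (y_score >= best_thresh).astype(int)
print(best_thresh, tpr[best], fpr[best], recall_score(y_true, y_pred))
```

Taking the threshold straight from the `thresholds` array, rather than retyping a rounded value from a printed table, keeps the recall on the scorecard identical to the TPR in the Youden's table.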
# choosing best features through RFE cross validation
lor = LogisticRegression(fit_intercept=False)
rfecv = RFECV(lor,cv = 3, scoring = 'roc_auc')
rfecv.fit(Xc,y)
RFECV(cv=3, estimator=LogisticRegression(fit_intercept=False),
scoring='roc_auc')
# printing the number of features selected
rfecv.n_features_
26
# converting the rank into a dataframe for column extraction
rfecv_df = pd.DataFrame(rfecv.ranking_,index = X_train.columns, columns=['Select'])
# extracting columns
cols = rfecv_df[rfecv_df['Select']==1].index
cols = cols.insert(loc = 0, item='const')
# building model with best features
model_logit_rfe = sm.Logit(y_train,X_train[cols]).fit()
Optimization terminated successfully.
Current function value: 0.072575
Iterations 10
# comparing train and test ROC scores
y_train_prob = model_logit_rfe.predict(X_train[cols])
print('Train-AUC:', roc_auc_score(y_train,y_train_prob))
y_test_prob = model_logit_rfe.predict(X_test[cols])
print('Test-AUC:', roc_auc_score(y_test,y_test_prob))
Train-AUC: 0.9474195638091125 Test-AUC: 0.9593105065666041
# getting predictions for test data
y_prob_logit_rfe = model_logit_rfe.predict(X_test[cols])
plot_roc(y_test, y_prob_logit_rfe)
# making RFE thresh scoreboard
thresh_score_logit_rfe = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score2(model,thresh):
    global thresh_score_logit_rfe
    y_prob = model.predict(X_test[cols])
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    row = pd.DataFrame([{'Threshold': thresh,
                         'Precision': metrics.precision_score(y_test, y_pred),
                         'Recall': metrics.recall_score(y_test, y_pred),
                         'F1-score': metrics.f1_score(y_test, y_pred),
                         'Kappa': metrics.cohen_kappa_score(y_test, y_pred)}])
    # DataFrame.append was removed in pandas 2.0, so concatenate the new row instead
    thresh_score_logit_rfe = pd.concat([thresh_score_logit_rfe, row], ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.1,0.9,9)
for i in values:
    cal_score2(model_logit_rfe,i)
thresh_score_logit_rfe
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.331429 | 0.743590 | 0.458498 | 0.428349 |
| 1 | 0.200000 | 0.428571 | 0.576923 | 0.491803 | 0.468554 |
| 2 | 0.300000 | 0.500000 | 0.500000 | 0.500000 | 0.480183 |
| 3 | 0.400000 | 0.600000 | 0.384615 | 0.468750 | 0.452442 |
| 4 | 0.500000 | 0.685714 | 0.307692 | 0.424779 | 0.410866 |
| 5 | 0.600000 | 0.782609 | 0.230769 | 0.356436 | 0.345064 |
| 6 | 0.700000 | 0.733333 | 0.141026 | 0.236559 | 0.227054 |
| 7 | 0.800000 | 0.800000 | 0.102564 | 0.181818 | 0.174667 |
| 8 | 0.900000 | 0.833333 | 0.064103 | 0.119048 | 0.114224 |
# calling_youdens table
fpr_logit_rfe, tpr_logit_rfe, thresholds_logit_rfe = roc_curve(y_test, y_prob_logit_rfe)
youdens_table(tpr_logit_rfe,fpr_logit_rfe,thresholds_logit_rfe).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.974359 | 0.159553 | 0.020815 | 0.814806 |
| 1 | 0.961538 | 0.148374 | 0.023002 | 0.813164 |
| 2 | 0.935897 | 0.125508 | 0.030921 | 0.810389 |
| 3 | 0.948718 | 0.144817 | 0.024093 | 0.803901 |
| 4 | 0.923077 | 0.120427 | 0.034413 | 0.802650 |
# adding thresh where tpr-fpr is maximum to the score board
cal_score2(model_logit_rfe,0.020815)
thresh_score_logit_rfe
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.331429 | 0.743590 | 0.458498 | 0.428349 |
| 1 | 0.200000 | 0.428571 | 0.576923 | 0.491803 | 0.468554 |
| 2 | 0.300000 | 0.500000 | 0.500000 | 0.500000 | 0.480183 |
| 3 | 0.400000 | 0.600000 | 0.384615 | 0.468750 | 0.452442 |
| 4 | 0.500000 | 0.685714 | 0.307692 | 0.424779 | 0.410866 |
| 5 | 0.600000 | 0.782609 | 0.230769 | 0.356436 | 0.345064 |
| 6 | 0.700000 | 0.733333 | 0.141026 | 0.236559 | 0.227054 |
| 7 | 0.800000 | 0.800000 | 0.102564 | 0.181818 | 0.174667 |
| 8 | 0.900000 | 0.833333 | 0.064103 | 0.119048 | 0.114224 |
| 9 | 0.020815 | 0.194872 | 0.974359 | 0.324786 | 0.278973 |
|
1. Our Logistic Regression model after RFE gave a test ROC-AUC of 0.9593, only slightly above the full model's 0.9556.
2. The highest observed precision was 0.833333, at threshold 0.9.
3. The highest observed recall was 0.974359, at threshold 0.020815, obtained through Youden's index.
4. The highest observed F1-score was 0.5, at threshold 0.3.
5. The highest observed kappa was 0.480183, at threshold 0.3. |
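To make the feature-selection step easier to follow, here is a minimal sketch of RFECV on synthetic data. The dataset comes from make_classification and is purely hypothetical; the estimator, cv=3 and the 'roc_auc' scoring mirror the call above, while max_iter=1000 is an added assumption to help the solver converge.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 20 features, only 5 of which carry signal.
X_toy, y_toy = make_classification(n_samples=600, n_features=20, n_informative=5,
                                   n_redundant=0, random_state=0)

# RFECV repeatedly drops the feature with the weakest coefficient and uses
# cross-validated ROC-AUC to decide how many features to keep.
selector = RFECV(LogisticRegression(max_iter=1000), cv=3, scoring='roc_auc')
selector.fit(X_toy, y_toy)

# Features that survive elimination have rank 1.
kept = [i for i, rank in enumerate(selector.ranking_) if rank == 1]
print(selector.n_features_, kept)
```

Filtering on rank 1 is the same trick the notebook uses to turn `rfecv.ranking_` into a list of column names.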
# making a composite thresh score card to compare performance
df_score = pd.concat([thresh_score,thresh_score_logit_rfe],keys = ['LoR full model','LoR RFE'])
df_score.head()
| | | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|---|
| LoR full model | 0 | 0.100000 | 0.335294 | 0.730769 | 0.459677 | 0.429880 |
| | 1 | 0.200000 | 0.423077 | 0.564103 | 0.483516 | 0.459989 |
| | 2 | 0.300000 | 0.473684 | 0.461538 | 0.467532 | 0.446713 |
| | 3 | 0.400000 | 0.566038 | 0.384615 | 0.458015 | 0.440764 |
| | 4 | 0.500000 | 0.771429 | 0.346154 | 0.477876 | 0.465247 |
# finding out at which thresh and model Recall/TPR is maximised
df_score.sort_values(by = 'Recall',ascending = False).head()
| | | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|---|
| LoR RFE | 9 | 0.020815 | 0.194872 | 0.974359 | 0.324786 | 0.278973 |
| LoR full model | 9 | 0.020431 | 0.193211 | 0.948718 | 0.321041 | 0.275123 |
| LoR RFE | 0 | 0.100000 | 0.331429 | 0.743590 | 0.458498 | 0.428349 |
| LoR full model | 0 | 0.100000 | 0.335294 | 0.730769 | 0.459677 | 0.429880 |
| LoR RFE | 1 | 0.200000 | 0.428571 | 0.576923 | 0.491803 | 0.468554 |
# making a function to plot confusion matrix
def plot_cm(y_test,y_pred):
    cm = confusion_matrix(y_test, y_pred)
    conf_matrix = pd.DataFrame(data = cm,columns = ['Predicted:0','Predicted:1'],
                               index = ['Actual:0','Actual:1'])
    sns.heatmap(conf_matrix, annot = True, fmt = 'd', cmap = ListedColormap(['#5a99d1']), cbar = False,
                linewidths = 0.1, annot_kws = {'size':25})
    plt.show()
# plotting confusion matrix for selected logit model(RFE) and thresh
y_pred_rfe = [0 if x < 0.020815 else 1 for x in model_logit_rfe.predict(X_test[cols])]
plot_cm(y_test,y_pred_rfe)
# creating scorecard to compare best models from different algorithms and adding best logit model
score_card = pd.DataFrame(columns=['Model Name','Threshold','ROC-AUC','Recall','Precision','Kappa'])
# DataFrame.append was removed in pandas 2.0, so concatenate the new row instead
logit_row = pd.DataFrame([{'Model Name': 'Logistic Regression with RFE',
                           'Threshold': 0.020815,
                           'ROC-AUC': roc_auc_score(y_test, y_prob_logit_rfe),
                           'Recall': metrics.recall_score(y_test,y_pred_rfe),
                           'Precision' : metrics.precision_score(y_test,y_pred_rfe),
                           'Kappa': metrics.cohen_kappa_score(y_test,y_pred_rfe)}])
score_card = pd.concat([score_card, logit_row], ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
|
Our aim is to maximise recall, because letting unhealthy companies slip by is more costly than classifying some healthy companies as unhealthy. By that logic, our best Logit model is the RFE model at threshold 0.020815: it has the highest recall (0.974359), at the cost of a low precision (0.194872).
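The confusion matrix plotted above is not reproduced in text, but its counts can be recovered from the reported metrics. The sketch below assumes the test set holds 78 bankrupt companies; this is an inference from the recall values (for example 0.974359 ≈ 76/78), not a number stated in the notebook.

```python
# Reported for the RFE model at threshold 0.020815 (test set of 2046 rows).
n_total = 2046
n_pos = 78            # assumption: inferred from the recall values, e.g. 76/78
recall = 0.974359
precision = 0.194872

tp = round(recall * n_pos)              # bankrupt companies caught
fn = n_pos - tp                         # bankrupt companies missed
fp = round(tp / precision) - tp         # healthy companies flagged
tn = n_total - n_pos - fp               # healthy companies cleared

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
p_observed = (tp + tn) / n_total
p_chance = ((tp + fp) * n_pos + (tn + fn) * (n_total - n_pos)) / n_total**2
kappa = (p_observed - p_chance) / (1 - p_chance)

print(tp, fn, fp, tn, round(kappa, 6))
```

If the assumption holds, this gives 76 true positives, 2 false negatives, 314 false positives and 1654 true negatives, and the kappa computed from these counts agrees with the 0.278973 on the scorecard. In other words, catching 76 of the 78 bankrupt companies comes at the price of flagging 314 healthy ones.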
|
|
We will follow these steps for the Decision Tree:
1. As decision trees are not sensitive to outliers, we will proceed with the raw data.
2. Create an iterable to tune parameters such as max_depth, min_samples_split, min_samples_leaf, max_features and max_leaf_nodes using RandomizedSearchCV.
3. Build a model with the best hyperparameters.
4. Build a scorecard to see how metrics such as precision, recall, F1-score and kappa behave at different thresholds.
5. Plot the confusion matrix at the best threshold.
6. Add the model with the best threshold (maximum recall) to the score card.
7. Look at feature importance. |
# using raw data to do the train test split(70:30)
X_dtc = df.drop(['Bankrupt?'],axis = 1)
y_dtc = df['Bankrupt?']
X_train_dtc, X_test_dtc, y_train_dtc, y_test_dtc = train_test_split(X_dtc, y_dtc, test_size=0.3, random_state=42)
# printing their shapes
print(X_train_dtc.shape)
print(y_train_dtc.shape)
print(X_test_dtc.shape)
print(y_test_dtc.shape)
(4773, 95)
(4773,)
(2046, 95)
(2046,)
# looking for the best parameters using random search
ran = []
for i in range(40,61,2):
    dtc = DecisionTreeClassifier(random_state=4)
    params = {'max_depth': sp_randint(2,6),
              'min_samples_split': sp_randint(2,100),
              'min_samples_leaf': sp_randint(5,50),
              'max_features':sp_randint(2,50),
              'max_leaf_nodes': sp_randint(2,100)}
    rsearch = RandomizedSearchCV(estimator=dtc, param_distributions=params,n_iter=500,scoring='roc_auc',n_jobs=-1,
                                 cv = 3,random_state = i)
    rsearch.fit(X_dtc,y_dtc)
    dtc = DecisionTreeClassifier(**rsearch.best_params_,random_state=4)
    dtc.fit(X_train_dtc,y_train_dtc)
    y_train_prob = dtc.predict_proba(X_train_dtc)[:,1]
    train_roc = round(roc_auc_score(y_train_dtc,y_train_prob),4)
    y_test_prob = dtc.predict_proba(X_test_dtc)[:,1]
    test_roc = round(roc_auc_score(y_test_dtc,y_test_prob),4)
    # storing train, test roc and random_state number so that we can get similar results
    ran.append((train_roc,test_roc,i))
# sorting list of train,test roc-auc and random_state number by test score, highest first
ran.sort(key = lambda x: x[1],reverse=True)
ran
[(0.9276, 0.9063, 42), (0.9276, 0.9063, 58), (0.9272, 0.9062, 52), (0.9276, 0.9062, 60), (0.9253, 0.8837, 54), (0.9596, 0.883, 40), (0.9468, 0.867, 44), (0.9092, 0.8569, 56), (0.907, 0.856, 50), (0.9087, 0.8546, 48), (0.9462, 0.8232, 46)]
# choosing best fit random state to extract hyperparameters
dtc = DecisionTreeClassifier(random_state=4)
params = {'max_depth': sp_randint(2,6),
'min_samples_split': sp_randint(2,100),
'min_samples_leaf': sp_randint(5,50),
'max_features':sp_randint(2,50),
'max_leaf_nodes': sp_randint(2,100)}
rsearch = RandomizedSearchCV(estimator=dtc, param_distributions=params,n_iter=500,scoring='roc_auc',n_jobs=-1,
cv = 3,random_state = 42)
rsearch.fit(X_dtc,y_dtc)
RandomizedSearchCV(cv=3, estimator=DecisionTreeClassifier(random_state=4),
n_iter=500, n_jobs=-1,
param_distributions={'max_depth': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21876a60>,
'max_features': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21876d60>,
'max_leaf_nodes': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21876df0>,
'min_samples_leaf': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21876a00>,
'min_samples_split': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe6cc115e0>},
random_state=42, scoring='roc_auc')
# print best fit hyperparameters
rsearch.best_params_
{'max_depth': 5,
'max_features': 7,
'max_leaf_nodes': 84,
'min_samples_leaf': 34,
'min_samples_split': 72}
# calling dtc with best hyperparameters
dtc = DecisionTreeClassifier(**rsearch.best_params_,random_state=4)
dtc.fit(X_train_dtc,y_train_dtc)
y_train_prob_dtc = dtc.predict_proba(X_train_dtc)[:,1]
print('Train-AUC:', roc_auc_score(y_train_dtc,y_train_prob_dtc))
y_test_prob_dtc = dtc.predict_proba(X_test_dtc)[:,1]
print('Test-AUC:', roc_auc_score(y_test_dtc,y_test_prob_dtc))
Train-AUC: 0.9276408830873384 Test-AUC: 0.9062760579528872
# plotting roc curve
plot_roc(y_test_dtc,y_test_prob_dtc)
# creating thresh score card for DecisionTree
thresh_score_dtc = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_dtc(model,thresh):
    global thresh_score_dtc
    y_prob = model.predict_proba(X_test_dtc)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    # scoring against y_test_dtc, the labels that belong to X_test_dtc
    row = pd.DataFrame([{'Threshold': thresh,
                         'Precision': metrics.precision_score(y_test_dtc, y_pred),
                         'Recall': metrics.recall_score(y_test_dtc, y_pred),
                         'F1-score': metrics.f1_score(y_test_dtc, y_pred),
                         'Kappa': metrics.cohen_kappa_score(y_test_dtc, y_pred)}])
    # DataFrame.append was removed in pandas 2.0, so concatenate the new row instead
    thresh_score_dtc = pd.concat([thresh_score_dtc, row], ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.1,0.9,9)
for i in values:
    cal_score_dtc(dtc,i)
thresh_score_dtc
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.363057 | 0.730769 | 0.485106 | 0.457470 |
| 1 | 0.200000 | 0.411765 | 0.538462 | 0.466667 | 0.442583 |
| 2 | 0.300000 | 0.400000 | 0.358974 | 0.378378 | 0.355123 |
| 3 | 0.400000 | 0.432432 | 0.205128 | 0.278261 | 0.260110 |
| 4 | 0.500000 | 0.432432 | 0.205128 | 0.278261 | 0.260110 |
| 5 | 0.600000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 6 | 0.700000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 7 | 0.800000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
# calling youdens_table
fpr_dtc, tpr_dtc, thresholds_dtc = roc_curve(y_test_dtc, y_test_prob_dtc)
youdens_table(tpr_dtc,fpr_dtc,thresholds_dtc).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.807692 | 0.098577 | 0.074830 | 0.709115 |
| 1 | 0.807692 | 0.104675 | 0.029412 | 0.703018 |
| 2 | 0.756410 | 0.064533 | 0.075472 | 0.691878 |
| 3 | 0.730769 | 0.050813 | 0.116279 | 0.679956 |
| 4 | 0.833333 | 0.161077 | 0.025316 | 0.672256 |
# adding thresh where tpr-fpr is maximum to the score board
cal_score_dtc(dtc,0.074830)
thresh_score_dtc
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.363057 | 0.730769 | 0.485106 | 0.457470 |
| 1 | 0.200000 | 0.411765 | 0.538462 | 0.466667 | 0.442583 |
| 2 | 0.300000 | 0.400000 | 0.358974 | 0.378378 | 0.355123 |
| 3 | 0.400000 | 0.432432 | 0.205128 | 0.278261 | 0.260110 |
| 4 | 0.500000 | 0.432432 | 0.205128 | 0.278261 | 0.260110 |
| 5 | 0.600000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 6 | 0.700000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 7 | 0.800000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 9 | 0.074830 | 0.317204 | 0.756410 | 0.446970 | 0.415575 |
# plotting confusion matrix for dtc model at youdens index
y_pred_dtc = [0 if x < 0.074830 else 1 for x in dtc.predict_proba(X_test_dtc)[:,1]]
plot_cm(y_test_dtc,y_pred_dtc)
# adding final dtc model to the scorecard
# DataFrame.append was removed in pandas 2.0, so concatenate the new row instead
dtc_row = pd.DataFrame([{'Model Name': 'Decision Tree',
                         'Threshold': 0.074830,
                         'ROC-AUC': roc_auc_score(y_test_dtc, dtc.predict_proba(X_test_dtc)[:,1]),
                         'Recall': metrics.recall_score(y_test_dtc,y_pred_dtc),
                         'Precision' : metrics.precision_score(y_test_dtc,y_pred_dtc),
                         'Kappa': metrics.cohen_kappa_score(y_test_dtc,y_pred_dtc)}])
score_card = pd.concat([score_card, dtc_row], ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
# extracting feature importances
imp = pd.DataFrame(dtc.feature_importances_,columns=['Importance'],index = X_dtc.columns).sort_values(by='Importance',
ascending=False).head(10)
# visualising feature importances
sns.barplot(y = imp.index, x = 'Importance', data = imp,palette='Blues_d')
plt.title('Comparing Feature Importance', fontsize = 20)
plt.show()
|
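1. Test ROC-AUC after hyperparameter tuning was 0.9063.
2. Recall at the threshold obtained through Youden's index (0.074830) was 0.756410. This is lower than the TPR of 0.807692 that the Youden's table lists for that threshold, most likely because the threshold was entered rounded to six decimals and the matching leaf probability falls just below the rounded value.
3. Net Income to Stockholder's Equity, i.e. return on equity, was the most important feature. |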
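The search above is fitted on the full dataset and the resulting model is then scored on the test split, so the test rows have already influenced the choice of hyperparameters. A variant that keeps the test rows out of the search might look like the sketch below. The data is synthetic (make_classification), and the reduced parameter grid and n_iter=20 are assumptions chosen only to keep the example quick.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical imbalanced data standing in for the bankruptcy set.
X_toy, y_toy = make_classification(n_samples=2000, n_features=20, weights=[0.9],
                                   random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.3,
                                          stratify=y_toy, random_state=42)

params = {'max_depth': randint(2, 6),
          'min_samples_leaf': randint(5, 50),
          'max_features': randint(2, 20)}

# The search only ever sees the training rows; the cross-validation inside it
# plays the role of the validation set.
search = RandomizedSearchCV(DecisionTreeClassifier(random_state=4), params,
                            n_iter=20, scoring='roc_auc', cv=3, random_state=42)
search.fit(X_tr, y_tr)

# best_estimator_ is already refit on all training rows (refit=True by default).
test_auc = roc_auc_score(y_te, search.best_estimator_.predict_proba(X_te)[:, 1])
print(search.best_params_, round(test_auc, 4))
```

With this arrangement the test ROC-AUC is an estimate on rows the tuning never touched, so it is a fairer number to put on the score card.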
|
We will follow these steps for the Random Forest Classifier:
1. As highly correlated features can distort a Random Forest's feature importances, we will proceed with the cleaned data.
2. Tune parameters such as n_estimators, max_depth, min_samples_split, min_samples_leaf, max_features and max_leaf_nodes using RandomizedSearchCV.
3. Build a model with the best hyperparameters.
4. Build a scorecard to see how metrics such as precision, recall, F1-score and kappa behave at different thresholds.
5. Plot the confusion matrix at the best threshold.
6. Add the model with the best threshold (maximum recall) to the score card.
7. Look at feature importance. |
# performing train-test split again as we don't need a constant anymore
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# printing shapes of the splits
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)
(4773, 47)
(4773,)
(2046, 47)
(2046,)
# looking for the best parameters using random search
ran = []
for i in range(40,61,2):
    rfc = RandomForestClassifier(random_state=4)
    rfc.fit(X_train,y_train)
    params = {'n_estimators': sp_randint(50,100),
              'max_depth': sp_randint(2,10),
              'min_samples_split': sp_randint(2,100),
              'min_samples_leaf': sp_randint(10,100),
              'max_features':sp_randint(3,50),
              'max_leaf_nodes': sp_randint(2,50),
              'max_samples': sp_randint(1000,6819)}
    rsearch_rfc = RandomizedSearchCV(estimator=rfc, param_distributions=params, scoring = 'roc_auc', cv = 3,
                                     n_iter = 100,n_jobs=-1, random_state = i)
    rsearch_rfc.fit(X,y)
    rfc = RandomForestClassifier(**rsearch_rfc.best_params_,random_state=4)
    rfc.fit(X_train,y_train)
    y_train_prob_rfc = rfc.predict_proba(X_train)[:,1]
    train_roc = round(roc_auc_score(y_train,y_train_prob_rfc),4)
    y_test_prob_rfc = rfc.predict_proba(X_test)[:,1]
    test_roc = round(roc_auc_score(y_test,y_test_prob_rfc),4)
    # storing train, test roc and random_state number so that we can get similar results
    ran.append((train_roc,test_roc,i))
# sorting list of train,test roc-auc and random_state number by test score, highest first
ran.sort(key = lambda x: x[1],reverse=True)
ran
[(0.977, 0.9482, 42), (0.966, 0.9462, 60), (0.973, 0.9458, 46), (0.9736, 0.9444, 48), (0.9742, 0.9441, 58), (0.9666, 0.944, 40), (0.9641, 0.9431, 44), (0.982, 0.9431, 52), (0.9713, 0.9365, 54), (0.9816, 0.9338, 50), (0.9672, 0.9325, 56)]
# hyperparameter tuning with best random state
rfc = RandomForestClassifier(random_state=4)
rfc.fit(X_train,y_train)
params = {'n_estimators': sp_randint(50,100),
'max_depth': sp_randint(2,10),
'min_samples_split': sp_randint(2,100),
'min_samples_leaf': sp_randint(10,100),
'max_features':sp_randint(3,50),
'max_leaf_nodes': sp_randint(2,50),
'max_samples': sp_randint(1000,6819)}
rsearch_rfc = RandomizedSearchCV(estimator=rfc, param_distributions=params, scoring = 'roc_auc', cv = 3, n_iter = 100,
n_jobs=-1, random_state = 60)
rsearch_rfc.fit(X,y)
RandomizedSearchCV(cv=3, estimator=RandomForestClassifier(random_state=4),
n_iter=100, n_jobs=-1,
param_distributions={'max_depth': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c6ef10>,
'max_features': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c735b0>,
'max_leaf_nodes': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7...
'max_samples': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c73b20>,
'min_samples_leaf': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c73370>,
'min_samples_split': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c73130>,
'n_estimators': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21c624f0>},
random_state=60, scoring='roc_auc')
rsearch_rfc.best_params_
{'max_depth': 5,
'max_features': 7,
'max_leaf_nodes': 35,
'max_samples': 2453,
'min_samples_leaf': 15,
'min_samples_split': 65,
'n_estimators': 89}
# getting train-test roc scores on best params
rfc = RandomForestClassifier(**rsearch_rfc.best_params_,random_state=4)
rfc.fit(X_train,y_train)
y_train_prob_rfc = rfc.predict_proba(X_train)[:,1]
print('Train-AUC:', roc_auc_score(y_train,y_train_prob_rfc))
y_test_prob_rfc = rfc.predict_proba(X_test)[:,1]
print('Test-AUC:', roc_auc_score(y_test,y_test_prob_rfc))
Train-AUC: 0.9659824635569842
Test-AUC: 0.9462359287054409
# plotting roc curve
plot_roc(y_test,y_test_prob_rfc)
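
As a reminder of what the ROC-AUC number measures: it equals the share of (bankrupt, non-bankrupt) pairs in which the bankrupt firm receives the higher predicted probability, with ties counted as half. A minimal sketch on toy data (the function `auc_by_pairs` and the toy values are hypothetical, not taken from the notebook or the dataset):

```python
def auc_by_pairs(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# toy labels and scores, for illustration only
y_toy = [0, 0, 1, 0, 1, 1]
p_toy = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
auc_toy = auc_by_pairs(y_toy, p_toy)
print(auc_toy)
```

The pairwise loop is quadratic, so it is only meant for intuition; `roc_auc_score` is the practical choice on the real test set.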
# creating threshold scorecard
thresh_score_rfc = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_rfc(model,thresh):
    y_prob = model.predict_proba(X_test)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_rfc
    # note: DataFrame.append was removed in pandas 2.0; pd.concat is the replacement there
    thresh_score_rfc = thresh_score_rfc.append({'Threshold': thresh,
                                                'Precision': metrics.precision_score(y_test, y_pred),
                                                'Recall': metrics.recall_score(y_test, y_pred),
                                                'F1-score': metrics.f1_score(y_test, y_pred),
                                                'Kappa':metrics.cohen_kappa_score(y_test, y_pred)},
                                               ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.1,0.9,9)
for i in values:
    cal_score_rfc(rfc,i)
thresh_score_rfc
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.310881 | 0.769231 | 0.442804 | 0.410811 |
| 1 | 0.200000 | 0.507246 | 0.448718 | 0.476190 | 0.456748 |
| 2 | 0.300000 | 0.571429 | 0.205128 | 0.301887 | 0.287537 |
| 3 | 0.400000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 4 | 0.500000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 5 | 0.600000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 6 | 0.700000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 7 | 0.800000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
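
To make the scorecard columns concrete, the four metrics can be computed by hand from the confusion-matrix counts at a given threshold. The sketch below is a plain-Python illustration on toy data; `threshold_metrics` is a hypothetical helper, not the notebook's `cal_score_rfc`, and it returns 0 when a ratio is undefined (which is what scikit-learn reports by default when no positive is predicted).

```python
def threshold_metrics(y_true, y_prob, thresh):
    """Precision, recall, F1 and Cohen's kappa at one probability threshold."""
    y_pred = [0 if p < thresh else 1 for p in y_prob]   # same rule as the notebook
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
    tn = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 0)
    n = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    observed = (tp + tn) / n                                             # observed agreement
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2  # chance agreement
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0
    return {'Precision': precision, 'Recall': recall, 'F1-score': f1, 'Kappa': kappa}

# toy labels and probabilities, for illustration only
y_toy = [0, 0, 1, 0, 1, 1]
p_toy = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
low = threshold_metrics(y_toy, p_toy, 0.3)
high = threshold_metrics(y_toy, p_toy, 0.9)
print(low)
print(high)
```

In the toy run, the high threshold flags nothing, so every metric is 0. The all-zero rows from 0.4 upwards in the table above are consistent with the same situation: no test probability reaching 0.4.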
# calling youdens_table
fpr_rfc, tpr_rfc, thresholds_rfc = roc_curve(y_test, y_test_prob_rfc)
youdens_table(tpr_rfc,fpr_rfc,thresholds_rfc).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.910256 | 0.140752 | 0.050021 | 0.769504 |
| 1 | 0.897436 | 0.136687 | 0.051553 | 0.760749 |
| 2 | 0.897436 | 0.140752 | 0.050046 | 0.756684 |
| 3 | 0.923077 | 0.168699 | 0.039682 | 0.754378 |
| 4 | 0.884615 | 0.132622 | 0.053740 | 0.751993 |
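
The `Difference` column is Youden's J statistic, J = TPR - FPR, and the first row is the threshold that maximises it. The notebook's `youdens_table` helper is defined earlier and not shown here; the sketch below is a hypothetical stand-in (`youden_ranking`) on toy data that illustrates the same idea without scikit-learn.

```python
def youden_ranking(y_true, y_prob):
    """Return (J, threshold, tpr, fpr) tuples sorted by J = TPR - FPR, best first."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rows = []
    # every distinct score is a candidate threshold; predict 1 when prob >= threshold
    for t in sorted(set(y_prob), reverse=True):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        tpr = tp / n_pos
        fpr = fp / n_neg
        rows.append((tpr - fpr, t, tpr, fpr))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows

# toy labels and probabilities, for illustration only
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.40, 0.15, 0.20, 0.80, 0.70, 0.05, 0.60]
best = youden_ranking(y_toy, p_toy)[0]
print(best)
```

Because J weighs sensitivity and specificity equally, it does not account for the different costs of missing a bankruptcy versus raising a false alarm, which is why the scorecard is still checked alongside it.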
# adding youdens thresh to scorecard
cal_score_rfc(rfc,0.050021)
thresh_score_rfc
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.310881 | 0.769231 | 0.442804 | 0.410811 |
| 1 | 0.200000 | 0.507246 | 0.448718 | 0.476190 | 0.456748 |
| 2 | 0.300000 | 0.571429 | 0.205128 | 0.301887 | 0.287537 |
| 3 | 0.400000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 4 | 0.500000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 5 | 0.600000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 6 | 0.700000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 7 | 0.800000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 9 | 0.050021 | 0.201729 | 0.897436 | 0.329412 | 0.284894 |
y_pred_rfc = [0 if x < 0.050021 else 1 for x in y_test_prob_rfc]
plot_cm(y_test,y_pred_rfc)
# adding final random forest model to the scorecard
score_card = score_card.append({'Model Name': 'Random Forest',
'Threshold': 0.050021,
'ROC-AUC': roc_auc_score(y_test, y_test_prob_rfc),
'Recall': metrics.recall_score(y_test,y_pred_rfc),
'Precision' : metrics.precision_score(y_test,y_pred_rfc),
'Kappa': metrics.cohen_kappa_score(y_test,y_pred_rfc)}, ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
# extracting feature importances
imp = pd.DataFrame(rfc.feature_importances_,columns=['Importance'],index = X.columns).sort_values(by='Importance',
ascending=False).head(10)
# visualising feature importances
sns.barplot(y = imp.index, x = 'Importance', data = imp,palette='Blues_d')
plt.title('Comparing Feature Importance', fontsize = 20)
plt.show()
|
1. ROC-AUC after hyperparameter tuning was 0.946236.
2. Recall at the threshold obtained through Youden's index (0.050021) is 0.897436.
3. Net value per share and borrowing dependency are two of the most important features. |
|
We will build the following boosting algorithms:
1. Adaptive Boosting
2. Gradient Boosting
3. Extreme Gradient Boosting (XGBoost) |
|
We will follow these steps for AdaBoost:
1. Proceed with the cleaned data
2. Tune parameters like n_estimators and learning_rate using RandomizedSearchCV
3. Build a model with the best hyperparameters
4. Build a scorecard to see how metrics like Precision, Recall, F1-score and Kappa perform under different thresholds
5. Plot the confusion matrix at the best threshold
6. Add the model with the best threshold (maximum recall) to the score card
7. Look at feature importance |
# tuning hyper parameters
ran = []
for i in range(40,61,2):
    adac = AdaBoostClassifier(random_state=4)
    params = {'n_estimators' : sp_randint(50,100),
              'learning_rate': [0.001, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3]}
    rsearch = RandomizedSearchCV(estimator=adac, param_distributions=params,n_iter=50,scoring='roc_auc',n_jobs=-1,
                                 cv = 3,random_state = i)
    rsearch.fit(X,y)
    adac = AdaBoostClassifier(**rsearch.best_params_,random_state=4)
    adac.fit(X_train,y_train)
    y_train_prob = adac.predict_proba(X_train)[:,1]
    train_roc = round(roc_auc_score(y_train,y_train_prob),4)
    y_test_prob = adac.predict_proba(X_test)[:,1]
    test_roc = round(roc_auc_score(y_test,y_test_prob),4)
    ran.append((train_roc,test_roc,i))
# printing list with best scores and random_state
ran.sort(key = lambda x: x[1],reverse=True)
ran
[(0.9584, 0.9406, 46), (0.9647, 0.9403, 52), (0.9647, 0.9402, 48), (0.9647, 0.9402, 54), (0.9646, 0.9401, 50), (0.9646, 0.9401, 60), (0.9642, 0.94, 40), (0.9642, 0.94, 42), (0.9645, 0.94, 58), (0.9621, 0.9397, 56), (0.9631, 0.9386, 44)]
# using best random_state to replicate results
adac = AdaBoostClassifier(random_state=4)
params = {'n_estimators' : sp_randint(2,100),
'learning_rate': [0.001, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3]}
rsearch1 = RandomizedSearchCV(estimator=adac, param_distributions=params,n_iter=50,scoring='roc_auc',n_jobs=-1,
cv = 3,random_state = 46)
rsearch1.fit(X,y)
RandomizedSearchCV(cv=3, estimator=AdaBoostClassifier(random_state=4),
n_iter=50, n_jobs=-1,
param_distributions={'learning_rate': [0.001, 0.01, 0.05,
0.1, 0.15, 0.2, 0.25,
0.3],
'n_estimators': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe2325be50>},
random_state=46, scoring='roc_auc')
# printing best params
rsearch1.best_params_
{'learning_rate': 0.1, 'n_estimators': 74}
# using best params to train model
adac = AdaBoostClassifier(**rsearch1.best_params_,random_state=4)
adac.fit(X_train,y_train)
y_train_prob = adac.predict_proba(X_train)[:,1]
print('Train-AUC:', roc_auc_score(y_train,y_train_prob))
y_test_prob = adac.predict_proba(X_test)[:,1]
print('Test-AUC:', roc_auc_score(y_test,y_test_prob))
Train-AUC: 0.9642321647440245
Test-AUC: 0.9399950489889515
# plotting roc curve
plot_roc(y_test, y_test_prob)
# building thresh scorecard
thresh_score_ada = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_ada(model,thresh):
    y_prob = model.predict_proba(X_test)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_ada
    thresh_score_ada = thresh_score_ada.append({'Threshold': thresh,
                                                'Precision': metrics.precision_score(y_test, y_pred),
                                                'Recall': metrics.recall_score(y_test, y_pred),
                                                'F1-score': metrics.f1_score(y_test, y_pred),
                                                'Kappa':metrics.cohen_kappa_score(y_test, y_pred)},
                                               ignore_index = True)
# checking performance at different thresholds
values = np.linspace(0.01,0.5,10)
for i in values:
    cal_score_ada(adac,i)
thresh_score_ada
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 1 | 0.064444 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 2 | 0.118889 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 3 | 0.173333 | 0.039157 | 1.000000 | 0.075362 | 0.002147 |
| 4 | 0.227778 | 0.045748 | 1.000000 | 0.087493 | 0.015729 |
| 5 | 0.282222 | 0.063882 | 1.000000 | 0.120092 | 0.052163 |
| 6 | 0.336667 | 0.116386 | 0.974359 | 0.207934 | 0.150043 |
| 7 | 0.391111 | 0.228782 | 0.794872 | 0.355301 | 0.314729 |
| 8 | 0.445556 | 0.480000 | 0.615385 | 0.539326 | 0.518710 |
| 9 | 0.500000 | 0.533333 | 0.102564 | 0.172043 | 0.161734 |
# calling youdens index
fpr_ada, tpr_ada, thresholds_ada = roc_curve(y_test, y_test_prob)
youdens_table(tpr_ada,fpr_ada,thresholds_ada).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.897436 | 0.173780 | 0.366408 | 0.723655 |
| 1 | 0.897436 | 0.174289 | 0.366170 | 0.723147 |
| 2 | 0.897436 | 0.175305 | 0.365857 | 0.722131 |
| 3 | 0.923077 | 0.203760 | 0.358847 | 0.719317 |
| 4 | 0.910256 | 0.192073 | 0.361949 | 0.718183 |
# comparing performance at thresh from youdens index
cal_score_ada(adac,0.366408)
thresh_score_ada
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 1 | 0.064444 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 2 | 0.118889 | 0.038123 | 1.000000 | 0.073446 | 0.000000 |
| 3 | 0.173333 | 0.039157 | 1.000000 | 0.075362 | 0.002147 |
| 4 | 0.227778 | 0.045748 | 1.000000 | 0.087493 | 0.015729 |
| 5 | 0.282222 | 0.063882 | 1.000000 | 0.120092 | 0.052163 |
| 6 | 0.336667 | 0.116386 | 0.974359 | 0.207934 | 0.150043 |
| 7 | 0.391111 | 0.228782 | 0.794872 | 0.355301 | 0.314729 |
| 8 | 0.445556 | 0.480000 | 0.615385 | 0.539326 | 0.518710 |
| 9 | 0.500000 | 0.533333 | 0.102564 | 0.172043 | 0.161734 |
| 10 | 0.366408 | 0.167883 | 0.884615 | 0.282209 | 0.233060 |
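
The F1-score column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick check, the sketch below recomputes it for the two candidate AdaBoost thresholds using only the precision and recall values printed in the table above (`f1_from` is an illustrative helper name):

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# threshold -> (precision, recall), copied from rows 6 and 10 of the table above
rows = {0.336667: (0.116386, 0.974359),
        0.366408: (0.167883, 0.884615)}

f1_values = {t: f1_from(p, r) for t, (p, r) in rows.items()}
for thresh, f1 in f1_values.items():
    print(thresh, round(f1, 6))
```

The recomputed values agree with the F1-score column up to rounding, and they show how strongly the low precision pulls F1 down even when recall is above 0.88.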
# plotting confusion matrix at best thresh
y_pred_ada = [0 if x < 0.336667 else 1 for x in adac.predict_proba(X_test)[:,1]]
plot_cm(y_test,y_pred_ada)
# adding final model to scorecard
score_card = score_card.append({'Model Name': 'Adaptive Boosting',
'Threshold': 0.336667,
'ROC-AUC': roc_auc_score(y_test, adac.predict_proba(X_test)[:,1]),
'Recall': metrics.recall_score(y_test,y_pred_ada),
'Precision' : metrics.precision_score(y_test,y_pred_ada),
'Kappa': metrics.cohen_kappa_score(y_test,y_pred_ada)}, ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
| 3 | Adaptive Boosting | 0.336667 | 0.939995 | 0.974359 | 0.116386 | 0.150043 |
# checking feature importance
imp = pd.DataFrame(adac.feature_importances_,columns=['Importance'],index = X.columns).sort_values(by='Importance',
ascending=False).head(10)
sns.barplot(y = imp.index, x = 'Importance',palette = 'Blues_d', data = imp)
plt.title('Comparing Feature Importance', fontsize = 20)
plt.show()
|
1. ROC-AUC after hyperparameter tuning was 0.9399.
2. The chosen threshold is 0.336667: it keeps Recall high (0.9743) with better Precision and Kappa than the lower thresholds, where Recall reaches 1.0 but Kappa is close to 0.
3. Non-industry income and expenditure/revenue and borrowing dependency are the most important features. |
|
We will follow these steps for Gradient Boosting:
1. Proceed with the cleaned data to tune the model
2. First tune the tree-based parameters, taking an initial guess for n_estimators and learning_rate
3. Tune parameters like max_depth, min_samples_split, min_samples_leaf, max_features, max_leaf_nodes and random_state using RandomizedSearchCV
4. With the best tree-based parameters from the first tuned model, tune the boosting parameters n_estimators and learning_rate
5. Tune min_impurity_decrease to see if the best score improves
6. Build a tuned model with the best hyperparameters
7. Build a metric scorecard to see how metrics like Precision, Recall, F1-score and Kappa perform under different thresholds
8. Make the Youden's table to see the best threshold obtained by maximising Youden's index
9. Compare the thresholds from the Youden's table and the metric scorecard
10. Plot the confusion matrix at the best threshold
11. Add the model with the best threshold to the score card
12. Look at feature importance |
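
The staged tuning in steps 2 to 6 can be sketched compactly as follows. This is a minimal illustration on synthetic data, not the notebook's actual run: `X_demo`, `y_demo`, `stage1`, `stage2` and the small search ranges are all assumptions chosen so the example runs quickly, and the `min_impurity_decrease` stage is left out. One deliberate difference from the notebook's cells: both searches are fitted on the training split only, so the final test AUC stays an out-of-sample estimate.

```python
import numpy as np
from scipy.stats import randint as sp_randint
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# synthetic, imbalanced stand-in data (NOT the bankruptcy dataset)
X_demo, y_demo = make_classification(n_samples=600, n_features=10,
                                     weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3,
                                          stratify=y_demo, random_state=0)

# stage 1: tune tree parameters with an initial guess for the boosting parameters
stage1 = RandomizedSearchCV(
    GradientBoostingClassifier(n_estimators=20, learning_rate=0.1, random_state=0),
    param_distributions={'max_depth': sp_randint(2, 5),
                         'min_samples_leaf': sp_randint(5, 30)},
    n_iter=4, scoring='roc_auc', cv=3, random_state=0)
stage1.fit(X_tr, y_tr)

# stage 2: fix the best tree parameters and tune the boosting parameters
stage2 = RandomizedSearchCV(
    GradientBoostingClassifier(**stage1.best_params_, random_state=0),
    param_distributions={'n_estimators': sp_randint(20, 40),
                         'learning_rate': np.linspace(0.05, 0.2, 4).tolist()},
    n_iter=4, scoring='roc_auc', cv=3, random_state=0)
stage2.fit(X_tr, y_tr)

# final model: combine both sets of best parameters and score on the held-out split
final_params = {**stage1.best_params_, **stage2.best_params_}
model = GradientBoostingClassifier(**final_params, random_state=0).fit(X_tr, y_tr)
test_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(final_params, round(test_auc, 4))
```

Tuning in stages keeps each search space small, at the cost of possibly missing combinations that a single joint search would find.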
# train-test split
x_gb = df4.drop(['Bankrupt?'],axis = 1)
y_gb = df4['Bankrupt?']
X_train_gb, X_test_gb, y_train_gb, y_test_gb = train_test_split(x_gb, y_gb, test_size=0.3, random_state=10)
# tuning tree based parameters
gb1=GradientBoostingClassifier(n_estimators=60,learning_rate=0.1)
tuned_params = {'max_depth': sp_randint(2,6),
'min_samples_split': sp_randint(2,100),
'min_samples_leaf': sp_randint(5,50),
'max_features':sp_randint(2,50),
'max_leaf_nodes': sp_randint(2,100),
'random_state':range(10,30)}
rsearch1 = RandomizedSearchCV(estimator=gb1, param_distributions=tuned_params,n_iter=500,scoring='roc_auc',n_jobs=-1,
cv = 3,random_state = 10)
rsearch1.fit(x_gb,y_gb)
RandomizedSearchCV(cv=3, estimator=GradientBoostingClassifier(n_estimators=60),
n_iter=500, n_jobs=-1,
param_distributions={'max_depth': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe2328ebe0>,
'max_features': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe23067f10>,
'max_leaf_nodes': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe21af9910>,
'min_samples_leaf': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe2325bb80>,
'min_samples_split': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe20dc9f70>,
'random_state': range(10, 30)},
random_state=10, scoring='roc_auc')
# getting best params
rsearch1.best_params_
{'max_depth': 5,
'max_features': 6,
'max_leaf_nodes': 89,
'min_samples_leaf': 42,
'min_samples_split': 68,
'random_state': 27}
# tuning boosting parameters
gb2=GradientBoostingClassifier(max_depth=5,min_samples_split=68,min_samples_leaf=42,
max_features=6,max_leaf_nodes=89,random_state=27)
lrate = np.linspace(0.05,0.2,20)
tuned_params2 = {'n_estimators': sp_randint(70,110),
'learning_rate': lrate}
rsearch2 = RandomizedSearchCV(estimator=gb2, param_distributions=tuned_params2,n_iter=500,scoring='roc_auc',n_jobs=-1,
cv = 3,random_state = 10)
rsearch2.fit(x_gb,y_gb)
RandomizedSearchCV(cv=3,
estimator=GradientBoostingClassifier(max_depth=5,
max_features=6,
max_leaf_nodes=89,
min_samples_leaf=42,
min_samples_split=68,
random_state=27),
n_iter=500, n_jobs=-1,
param_distributions={'learning_rate': array([0.05 , 0.05789474, 0.06578947, 0.07368421, 0.08157895,
0.08947368, 0.09736842, 0.10526316, 0.11315789, 0.12105263,
0.12894737, 0.13684211, 0.14473684, 0.15263158, 0.16052632,
0.16842105, 0.17631579, 0.18421053, 0.19210526, 0.2 ]),
'n_estimators': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe2328e9d0>},
random_state=10, scoring='roc_auc')
# printing best params
rsearch2.best_params_
{'learning_rate': 0.13684210526315793, 'n_estimators': 82}
# tuning min_impurity decrease
gb3=GradientBoostingClassifier(max_depth=5,min_samples_split=68,min_samples_leaf=42,
max_features=6,max_leaf_nodes=89,random_state=27,
n_estimators=82,learning_rate=0.14)
min_imp_dec = np.linspace(0,0.1,5)
tuned_params3 = {'min_impurity_decrease': min_imp_dec}
rsearch3 = RandomizedSearchCV(estimator=gb3, param_distributions=tuned_params3,n_iter=500,scoring='roc_auc',n_jobs=-1,
cv = 3,random_state = 10)
rsearch3.fit(x_gb,y_gb)
RandomizedSearchCV(cv=3,
estimator=GradientBoostingClassifier(learning_rate=0.14,
max_depth=5,
max_features=6,
max_leaf_nodes=89,
min_samples_leaf=42,
min_samples_split=68,
n_estimators=82,
random_state=27),
n_iter=500, n_jobs=-1,
param_distributions={'min_impurity_decrease': array([0. , 0.025, 0.05 , 0.075, 0.1 ])},
random_state=10, scoring='roc_auc')
# printing best params
rsearch3.best_params_
{'min_impurity_decrease': 0.07500000000000001}
# using best params to train model
gb_tuned = GradientBoostingClassifier(max_depth=5,min_samples_split=68,min_samples_leaf=42,
max_features=6,max_leaf_nodes=89,random_state=27,
n_estimators=82,learning_rate=0.14,min_impurity_decrease=0.075)
gb_tuned.fit(X_train_gb,y_train_gb)
y_train_prob_gb = gb_tuned.predict_proba(X_train_gb)[:,1]
print('Train-AUC for GB tuned model:', roc_auc_score(y_train_gb,y_train_prob_gb))
y_test_prob_gb = gb_tuned.predict_proba(X_test_gb)[:,1]
print('Test-AUC for GB tuned model:', roc_auc_score(y_test_gb,y_test_prob_gb))
Train-AUC for GB tuned model: 0.9876173123810701 Test-AUC for GB tuned model: 0.9523568170299037
# plotting roc-curve
plot_roc(y_test_gb,y_test_prob_gb)
# making thresh score
thresh_score_gb_tuned = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_gb(model,thresh):
    y_prob = model.predict_proba(X_test_gb)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_gb_tuned
    thresh_score_gb_tuned = thresh_score_gb_tuned.append({'Threshold': thresh,
                                                          'Precision': metrics.precision_score(y_test_gb, y_pred),
                                                          'Recall': metrics.recall_score(y_test_gb, y_pred),
                                                          'F1-score': metrics.f1_score(y_test_gb, y_pred),
                                                          'Kappa':metrics.cohen_kappa_score(y_test_gb, y_pred)},
                                                         ignore_index = True)
# checking performance at different thresholds
values = np.linspace(0.01,0.1,10)
for i in values:
    cal_score_gb(gb_tuned,i)
thresh_score_gb_tuned
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.151448 | 0.931507 | 0.260536 | 0.212180 |
| 1 | 0.020000 | 0.226148 | 0.876712 | 0.359551 | 0.321035 |
| 2 | 0.030000 | 0.264957 | 0.849315 | 0.403909 | 0.369622 |
| 3 | 0.040000 | 0.288557 | 0.794521 | 0.423358 | 0.391505 |
| 4 | 0.050000 | 0.306818 | 0.739726 | 0.433735 | 0.403656 |
| 5 | 0.060000 | 0.316770 | 0.698630 | 0.435897 | 0.406772 |
| 6 | 0.070000 | 0.333333 | 0.657534 | 0.442396 | 0.414679 |
| 7 | 0.080000 | 0.366412 | 0.657534 | 0.470588 | 0.445164 |
| 8 | 0.090000 | 0.379032 | 0.643836 | 0.477157 | 0.452569 |
| 9 | 0.100000 | 0.394958 | 0.643836 | 0.489583 | 0.465964 |
# calling youdens index
fpr_gb, tpr_gb, thresholds_gb = roc_curve(y_test_gb, y_test_prob_gb)
youdens_table(tpr_gb,fpr_gb,thresholds_gb).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.890411 | 0.116067 | 0.019109 | 0.774344 |
| 1 | 0.904110 | 0.133806 | 0.015549 | 0.770303 |
| 2 | 0.876712 | 0.107957 | 0.021000 | 0.768755 |
| 3 | 0.849315 | 0.084136 | 0.031267 | 0.765179 |
| 4 | 0.986301 | 0.225038 | 0.008147 | 0.761263 |
# checking performance at thresh obtained through youdens index
cal_score_gb(gb_tuned,0.019109)
thresh_score_gb_tuned
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.151448 | 0.931507 | 0.260536 | 0.212180 |
| 1 | 0.020000 | 0.226148 | 0.876712 | 0.359551 | 0.321035 |
| 2 | 0.030000 | 0.264957 | 0.849315 | 0.403909 | 0.369622 |
| 3 | 0.040000 | 0.288557 | 0.794521 | 0.423358 | 0.391505 |
| 4 | 0.050000 | 0.306818 | 0.739726 | 0.433735 | 0.403656 |
| 5 | 0.060000 | 0.316770 | 0.698630 | 0.435897 | 0.406772 |
| 6 | 0.070000 | 0.333333 | 0.657534 | 0.442396 | 0.414679 |
| 7 | 0.080000 | 0.366412 | 0.657534 | 0.470588 | 0.445164 |
| 8 | 0.090000 | 0.379032 | 0.643836 | 0.477157 | 0.452569 |
| 9 | 0.100000 | 0.394958 | 0.643836 | 0.489583 | 0.465964 |
| 10 | 0.019109 | 0.218430 | 0.876712 | 0.349727 | 0.310329 |
# plotting confusion matrix
y_test_pred_gb = [0 if x < 0.02 else 1 for x in y_test_prob_gb]
plot_cm(y_test_gb,y_test_pred_gb)
# adding final model to scorecard
score_card = score_card.append({'Model Name': 'Gradient Boosting',
'Threshold': 0.02,
'ROC-AUC': roc_auc_score(y_test_gb, y_test_prob_gb),
'Recall': metrics.recall_score(y_test_gb,y_test_pred_gb),
'Precision' : metrics.precision_score(y_test_gb,y_test_pred_gb),
'Kappa': metrics.cohen_kappa_score(y_test_gb,y_test_pred_gb)}, ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
| 3 | Adaptive Boosting | 0.336667 | 0.939995 | 0.974359 | 0.116386 | 0.150043 |
| 4 | Gradient Boosting | 0.020000 | 0.952357 | 0.876712 | 0.226148 | 0.321035 |
# visualising feature importances
imp = pd.DataFrame(gb_tuned.feature_importances_,columns=['Importance'],index = x_gb.columns).sort_values(by='Importance',
ascending=False).head(10)
sns.barplot(y = imp.index, x = 'Importance',palette = 'Blues_d', data = imp)
plt.title('Comparing Feature Importance', fontsize = 20)
plt.show()
|
1. ROC-AUC for train and test after hyperparameter tuning are 0.9876 and 0.9523.
2. Recall at the threshold obtained through Youden's index (0.019109) is 0.876712.
3. The threshold 0.02 gives the same recall but better values for precision, F1-score and Kappa.
4. We choose 0.02 as the best threshold.
5. Net Value per share (B) is the most important feature. |
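
The comparison between 0.019109 and 0.02 is easier to see in counts than in rates. The sketch below converts the precision and recall values from the scorecard back into true and false positives. It assumes 73 bankrupt firms in this test split, a figure inferred from the recall column (every value is a multiple of 1/73) rather than stated in the notebook; `N_POS` and `counts` are illustrative names.

```python
N_POS = 73  # assumption: inferred from the recall values, not printed in the notebook

def counts(precision, recall, n_pos=N_POS):
    """Recover (true positives, false positives) from precision and recall."""
    tp = round(recall * n_pos)      # recall = tp / n_pos
    flagged = round(tp / precision) # precision = tp / flagged
    return tp, flagged - tp

# threshold -> (precision, recall), copied from rows 10 and 1 of the scorecard
rows = {0.019109: (0.218430, 0.876712),
        0.02: (0.226148, 0.876712)}

recovered = {t: counts(p, r) for t, (p, r) in rows.items()}
for thresh, (tp, fp) in recovered.items():
    print(thresh, 'true positives:', tp, 'false positives:', fp)
```

Under that assumption, both thresholds catch the same bankrupt firms, and the slightly higher threshold simply raises fewer false alarms.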
|
We will follow these steps for XGBoost:
1. Even though XGBoost can handle unclean data, we will proceed with the cleaned data
2. Tune parameters like n_estimators, learning_rate, max_depth, gamma and reg_lambda using RandomizedSearchCV
3. Build a model with the best hyperparameters
4. Build a scorecard to see how metrics like Precision, Recall, F1-score and Kappa perform under different thresholds
5. Plot the confusion matrix at the best threshold
6. Add the model with the best threshold (maximum recall) to the score card
7. Look at feature importance |
# tuning parameters
xgb = XGBClassifier(objective = 'binary:logistic',random_state = 4, n_jobs= -1)
params = {'n_estimators': sp_randint(50,150),
'learning_rate': np.linspace(0,1,100),
'max_depth': sp_randint(2,6),
'gamma': sp_randint(1,50),
'reg_lambda': [0.001,0.1,1,10,100]}
rsearch_xgb = RandomizedSearchCV(estimator=xgb, param_distributions=params,n_iter = 500, cv = 3, n_jobs = -1,
scoring = 'roc_auc',random_state=42)
rsearch_xgb.fit(X,y)
[20:20:53] WARNING: /opt/concourse/worker/volumes/live/7a2b9f41-3287-451b-6691-43e9a6c0910f/volume/xgboost-split_1619728204606/work/src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
RandomizedSearchCV(cv=3,
estimator=XGBClassifier(base_score=None, booster=None,
colsample_bylevel=None,
colsample_bynode=None,
colsample_bytree=None, gamma=None,
gpu_id=None, importance_type='gain',
interaction_constraints=None,
learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None,
n_estimators=100,...
0.85858586, 0.86868687, 0.87878788, 0.88888889, 0.8989899 ,
0.90909091, 0.91919192, 0.92929293, 0.93939394, 0.94949495,
0.95959596, 0.96969697, 0.97979798, 0.98989899, 1. ]),
'max_depth': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd11627fd60>,
'n_estimators': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd11628bd30>,
'reg_lambda': [0.001, 0.1, 1, 10, 100]},
random_state=42, scoring='roc_auc')
# printing best params
rsearch_xgb.best_params_
{'gamma': 6,
'learning_rate': 0.18181818181818182,
'max_depth': 2,
'n_estimators': 118,
'reg_lambda': 1}
# storing best params
best_params = {'gamma': 6,
'learning_rate': 0.18181818181818182,
'max_depth': 2,
'n_estimators': 118,
'reg_lambda': 1}
# getting train-test roc scores on best params
xgb = XGBClassifier(**best_params,objective = 'binary:logistic',random_state = 4, n_jobs= -1)
xgb.fit(X_train,y_train)
y_train_prob_xgb = xgb.predict_proba(X_train)[:,1]
print('Train-AUC:', roc_auc_score(y_train,y_train_prob_xgb))
y_test_prob_xgb = xgb.predict_proba(X_test)[:,1]
print('Test-AUC:', roc_auc_score(y_test,y_test_prob_xgb))
[17:15:22] WARNING: /opt/concourse/worker/volumes/live/7a2b9f41-3287-451b-6691-43e9a6c0910f/volume/xgboost-split_1619728204606/work/src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
Train-AUC: 0.9743826813178792
Test-AUC: 0.9501185636856369
# plotting roc curve
plot_roc(y_test,y_test_prob_xgb)
# creating threshold scorecard
thresh_score_xgb = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_xgb(model,thresh):
    y_prob = model.predict_proba(X_test)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_xgb
    thresh_score_xgb = thresh_score_xgb.append({'Threshold': thresh,
                                                'Precision': metrics.precision_score(y_test, y_pred),
                                                'Recall': metrics.recall_score(y_test, y_pred),
                                                'F1-score': metrics.f1_score(y_test, y_pred),
                                                'Kappa':metrics.cohen_kappa_score(y_test, y_pred)},
                                               ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.1,0.9,9)
for i in values:
    cal_score_xgb(xgb,i)
thresh_score_xgb
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.384615 | 0.705128 | 0.497738 | 0.471672 |
| 1 | 0.200000 | 0.522222 | 0.602564 | 0.559524 | 0.540766 |
| 2 | 0.300000 | 0.584615 | 0.487179 | 0.531469 | 0.514647 |
| 3 | 0.400000 | 0.565217 | 0.333333 | 0.419355 | 0.402453 |
| 4 | 0.500000 | 0.571429 | 0.256410 | 0.353982 | 0.338357 |
| 5 | 0.600000 | 0.588235 | 0.128205 | 0.210526 | 0.199606 |
| 6 | 0.700000 | 1.000000 | 0.051282 | 0.097561 | 0.094192 |
| 7 | 0.800000 | 1.000000 | 0.025641 | 0.050000 | 0.048186 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
# calling youdens_table
fpr_xgb, tpr_xgb, thresholds_xgb = roc_curve(y_test, y_test_prob_xgb)
youdens_table(tpr_xgb,fpr_xgb,thresholds_xgb).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.923077 | 0.157520 | 0.020784 | 0.765557 |
| 1 | 0.961538 | 0.201220 | 0.015142 | 0.760319 |
| 2 | 0.948718 | 0.191565 | 0.016621 | 0.757153 |
| 3 | 0.935897 | 0.179878 | 0.017934 | 0.756019 |
| 4 | 0.897436 | 0.141768 | 0.023485 | 0.755668 |
# adding thresh obtained from youdens index
cal_score_xgb(xgb,0.020784)
thresh_score_xgb
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.384615 | 0.705128 | 0.497738 | 0.471672 |
| 1 | 0.200000 | 0.522222 | 0.602564 | 0.559524 | 0.540766 |
| 2 | 0.300000 | 0.584615 | 0.487179 | 0.531469 | 0.514647 |
| 3 | 0.400000 | 0.565217 | 0.333333 | 0.419355 | 0.402453 |
| 4 | 0.500000 | 0.571429 | 0.256410 | 0.353982 | 0.338357 |
| 5 | 0.600000 | 0.588235 | 0.128205 | 0.210526 | 0.199606 |
| 6 | 0.700000 | 1.000000 | 0.051282 | 0.097561 | 0.094192 |
| 7 | 0.800000 | 1.000000 | 0.025641 | 0.050000 | 0.048186 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 9 | 0.020784 | 0.186352 | 0.910256 | 0.309368 | 0.262705 |
# adding another thresh obtained from youdens index
cal_score_xgb(xgb,0.015142)
thresh_score_xgb
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.100000 | 0.384615 | 0.705128 | 0.497738 | 0.471672 |
| 1 | 0.200000 | 0.522222 | 0.602564 | 0.559524 | 0.540766 |
| 2 | 0.300000 | 0.584615 | 0.487179 | 0.531469 | 0.514647 |
| 3 | 0.400000 | 0.565217 | 0.333333 | 0.419355 | 0.402453 |
| 4 | 0.500000 | 0.571429 | 0.256410 | 0.353982 | 0.338357 |
| 5 | 0.600000 | 0.588235 | 0.128205 | 0.210526 | 0.199606 |
| 6 | 0.700000 | 1.000000 | 0.051282 | 0.097561 | 0.094192 |
| 7 | 0.800000 | 1.000000 | 0.025641 | 0.050000 | 0.048186 |
| 8 | 0.900000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 9 | 0.020784 | 0.186352 | 0.910256 | 0.309368 | 0.262705 |
| 10 | 0.015142 | 0.159236 | 0.961538 | 0.273224 | 0.222356 |
# plotting confusion matrix
y_pred_xgb = [0 if x < 0.015142 else 1 for x in y_test_prob_xgb]
plot_cm(y_test,y_pred_xgb)
# adding final model to the scorecard
score_card = score_card.append({'Model Name': 'Extreme Gradient Boosting(XGBoost)',
'Threshold': 0.015142,
'ROC-AUC': roc_auc_score(y_test, y_test_prob_xgb),
'Recall': metrics.recall_score(y_test,y_pred_xgb),
'Precision' : metrics.precision_score(y_test,y_pred_xgb),
'Kappa': metrics.cohen_kappa_score(y_test,y_pred_xgb)}, ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
| 3 | Adaptive Boosting | 0.336667 | 0.939995 | 0.974359 | 0.116386 | 0.150043 |
| 4 | Gradient Boosting | 0.020000 | 0.952357 | 0.876712 | 0.226148 | 0.321035 |
| 5 | Extreme Gradient Boosting(XGBoost) | 0.015142 | 0.950119 | 0.961538 | 0.159236 | 0.222356 |
# extracting feature importances
imp = pd.DataFrame(xgb.feature_importances_,columns=['Importance'],index = X.columns).sort_values(by='Importance',
ascending=False).head(10)
# visualising feature importances
sns.barplot(y = imp.index, x = 'Importance', data = imp,palette='Blues_d')
plt.title('Comparing Feature Importance', fontsize = 20)
plt.show()
|
1. ROC-AUC for train and test after hyperparameter tuning are 0.97438 and 0.950119.
2. Recall at the thresholds obtained through Youden's index (0.020784 and 0.015142) is 0.910256 and 0.961538 respectively.
3. As the increase in recall is larger than the decrease in precision, we chose 0.015142.
4. Degree of financial leverage and borrowing dependency are the most important features. |
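
The trade-off behind choosing 0.015142 can be checked directly from the scorecard rows. A small sketch using only the precision and recall values printed above (variable names are illustrative):

```python
# threshold -> metrics, copied from rows 9 and 10 of the XGBoost scorecard
rows = {0.020784: {'precision': 0.186352, 'recall': 0.910256},
        0.015142: {'precision': 0.159236, 'recall': 0.961538}}

# moving from 0.020784 down to 0.015142
recall_gain = rows[0.015142]['recall'] - rows[0.020784]['recall']
precision_loss = rows[0.020784]['precision'] - rows[0.015142]['precision']
print(round(recall_gain, 6), round(precision_loss, 6))
```

Comparing the two differences in absolute terms treats a point of recall and a point of precision as equally valuable; a cost-based comparison could weigh missed bankruptcies differently.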
|
We will follow these steps for K-Nearest Neighbors:
1. As KNN is sensitive to unscaled data, we will proceed with the cleaned data
2. Tune hyperparameters like n_neighbors and p
3. Build a model with the tuned hyperparameters
4. Build a scorecard to see how metrics like Precision, Recall, F1-score and Kappa perform under different thresholds |
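
For context on the `p` hyperparameter: in `KNeighborsClassifier` it is the power of the Minkowski distance, so p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. A minimal plain-Python sketch of the formula (`minkowski` is an illustrative helper, not a library call):

```python
def minkowski(a, b, p):
    """Minkowski distance between two equal-length points for power p >= 1."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

manhattan = minkowski((0, 0), (3, 4), 1)   # sum of absolute differences
euclidean = minkowski((0, 0), (3, 4), 2)   # straight-line distance
print(manhattan, euclidean)
```

Lower values of p weigh every coordinate difference more evenly, while higher values let the largest single difference dominate, which is why the choice interacts with feature scaling.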
knn = KNeighborsClassifier(n_jobs = -1)
params = {'n_neighbors': sp_randint(5,47),
'p': sp_randint(1,10)}
rsearch_knn = RandomizedSearchCV(estimator=knn, param_distributions=params, cv = 3, n_iter = 50, scoring = 'roc_auc',
n_jobs = -1, random_state=42)
rsearch_knn.fit(X,y)
RandomizedSearchCV(cv=3, estimator=KNeighborsClassifier(n_jobs=-1), n_iter=50,
n_jobs=-1,
param_distributions={'n_neighbors': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe29840070>,
'p': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7ffe26ffd430>},
random_state=42, scoring='roc_auc')
rsearch_knn.best_params_
{'n_neighbors': 45, 'p': 1}
best_params = {'n_neighbors': 44, 'p': 3}
knn = KNeighborsClassifier(**rsearch_knn.best_params_,n_jobs = -1)
knn.fit(X_train,y_train)
y_prob_train = knn.predict_proba(X_train)[:,1]
print('Train ROC-AUC:', metrics.roc_auc_score(y_train,y_prob_train))
y_prob_test = knn.predict_proba(X_test)[:,1]
print('Test ROC-AUC:', metrics.roc_auc_score(y_test,y_prob_test))
Train ROC-AUC: 0.9575503115866436
Test ROC-AUC: 0.9437180790077132
# plotting roc curve
plot_roc(y_test,y_prob_test)
# creating threshold scorecard
thresh_score_knn = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_knn(model,thresh):
    y_prob = model.predict_proba(X_test)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_knn
    # pd.concat is used because DataFrame.append was removed in pandas 2.0
    row = pd.DataFrame([{'Threshold': thresh,
                         'Precision': metrics.precision_score(y_test, y_pred),
                         'Recall': metrics.recall_score(y_test, y_pred),
                         'F1-score': metrics.f1_score(y_test, y_pred),
                         'Kappa': metrics.cohen_kappa_score(y_test, y_pred)}])
    thresh_score_knn = pd.concat([thresh_score_knn, row], ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.01,0.1,9)
for i in values:
    cal_score_knn(knn,i)
thresh_score_knn
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.118565 | 0.974359 | 0.211405 | 0.153891 |
| 1 | 0.021250 | 0.118565 | 0.974359 | 0.211405 | 0.153891 |
| 2 | 0.032500 | 0.197222 | 0.910256 | 0.324201 | 0.279018 |
| 3 | 0.043750 | 0.197222 | 0.910256 | 0.324201 | 0.279018 |
| 4 | 0.055000 | 0.261719 | 0.858974 | 0.401198 | 0.364031 |
| 5 | 0.066250 | 0.261719 | 0.858974 | 0.401198 | 0.364031 |
| 6 | 0.077500 | 0.311224 | 0.782051 | 0.445255 | 0.413254 |
| 7 | 0.088750 | 0.311224 | 0.782051 | 0.445255 | 0.413254 |
| 8 | 0.100000 | 0.367347 | 0.692308 | 0.480000 | 0.452739 |
# calling youdens_table
fpr_knn, tpr_knn, thresholds_knn = roc_curve(y_test, y_prob_test)
youdens_table(tpr_knn,fpr_knn,thresholds_knn).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.910256 | 0.146850 | 0.044444 | 0.763407 |
| 1 | 0.858974 | 0.096037 | 0.066667 | 0.762938 |
| 2 | 0.782051 | 0.068598 | 0.088889 | 0.713454 |
| 3 | 0.974359 | 0.287093 | 0.022222 | 0.687265 |
| 4 | 0.692308 | 0.047256 | 0.111111 | 0.645052 |
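The `Difference` column above is Youden's index, J = TPR - FPR, and the threshold with the largest J is the one carried forward. The `youdens_table` helper is defined outside this section, so the following is only a minimal pure-Python sketch of the same idea on toy labels and probabilities. The function name `youden_threshold` and the data are assumptions for illustration.

```python
# Toy labels and predicted probabilities, made up for illustration.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.01, 0.02, 0.02, 0.03, 0.05, 0.06, 0.07, 0.08, 0.30, 0.60]

def youden_threshold(y_true, y_prob):
    """Return (threshold, J) for the threshold that maximises J = TPR - FPR.

    A sample is predicted positive when its probability is >= threshold,
    matching the `0 if x < thresh else 1` rule used in this notebook.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        j = tp / pos - fp / neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

best_t, best_j = youden_threshold(y_true, y_prob)
print('best threshold:', best_t, 'Youden J:', round(best_j, 4))
```

Because bankruptcies are rare, the J-maximising threshold tends to sit far below 0.5, which is consistent with the small thresholds in the tables of this notebook.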
# adding another thresh obtained from youdens index
cal_score_knn(knn,0.044444)
thresh_score_knn
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.010000 | 0.118565 | 0.974359 | 0.211405 | 0.153891 |
| 1 | 0.021250 | 0.118565 | 0.974359 | 0.211405 | 0.153891 |
| 2 | 0.032500 | 0.197222 | 0.910256 | 0.324201 | 0.279018 |
| 3 | 0.043750 | 0.197222 | 0.910256 | 0.324201 | 0.279018 |
| 4 | 0.055000 | 0.261719 | 0.858974 | 0.401198 | 0.364031 |
| 5 | 0.066250 | 0.261719 | 0.858974 | 0.401198 | 0.364031 |
| 6 | 0.077500 | 0.311224 | 0.782051 | 0.445255 | 0.413254 |
| 7 | 0.088750 | 0.311224 | 0.782051 | 0.445255 | 0.413254 |
| 8 | 0.100000 | 0.367347 | 0.692308 | 0.480000 | 0.452739 |
| 9 | 0.044444 | 0.197222 | 0.910256 | 0.324201 | 0.279018 |
# plotting confusion matrix
y_pred_knn = [0 if x < 0.044444 else 1 for x in y_prob_test]
plot_cm(y_test,y_pred_knn)
# adding final model to the scorecard
# pd.concat is used because DataFrame.append was removed in pandas 2.0
score_card = pd.concat([score_card, pd.DataFrame([{'Model Name': 'K-Nearest Neighbors',
                        'Threshold': 0.044444,
                        'ROC-AUC': roc_auc_score(y_test, y_prob_test),
                        'Recall': metrics.recall_score(y_test,y_pred_knn),
                        'Precision' : metrics.precision_score(y_test,y_pred_knn),
                        'Kappa': metrics.cohen_kappa_score(y_test,y_pred_knn)}])], ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
| 3 | Adaptive Boosting | 0.336667 | 0.939995 | 0.974359 | 0.116386 | 0.150043 |
| 4 | Gradient Boosting | 0.020000 | 0.952357 | 0.876712 | 0.226148 | 0.321035 |
| 5 | Extreme Gradient Boosting(XGBoost) | 0.015142 | 0.950119 | 0.961538 | 0.159236 | 0.222356 |
| 6 | K-Nearest Neighbors | 0.044444 | 0.936568 | 0.910256 | 0.197222 | 0.279018 |
|
1. ROC-AUC for train and test after hyperparameter tuning are 0.957550 and 0.943718.
2. Recall at the threshold obtained through Youden's index (0.044444) is 0.910256, with a precision of 0.197222.
3. The scorecard gives identical metrics at 0.0325, 0.04375 and 0.044444, so we keep the Youden's index threshold of 0.044444. |
|
We will follow these steps for Naive Bayes:
1. As Naive Bayes is sensitive to unclean data, we will proceed with the cleaned data.
2. Build a Gaussian Naive Bayes model.
3. Build a scorecard to see how metrics like Precision, Recall, F1-score and Kappa perform under different thresholds. |
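The threshold scorecard in step 3 can be sketched without any library, which also makes the metric definitions explicit. This is a minimal illustration on toy data, not the notebook's implementation: the function name `scorecard_row` and the labels and probabilities are assumptions. The notebook itself relies on `sklearn.metrics` for the same quantities.

```python
def scorecard_row(y_true, y_prob, thresh):
    """Precision, recall, F1 and Cohen's kappa at one probability threshold."""
    y_pred = [0 if p < thresh else 1 for p in y_prob]
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
    tn = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 0)
    n = len(y_true)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else 0.0

    return {'Threshold': thresh, 'Precision': precision, 'Recall': recall,
            'F1-score': f1, 'Kappa': kappa}

# Toy labels and probabilities, made up for illustration.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.01, 0.02, 0.02, 0.03, 0.05, 0.06, 0.07, 0.08, 0.30, 0.60]

# Collect one dict per threshold; a list of dicts converts directly to a DataFrame.
rows = [scorecard_row(y_true, y_prob, t) for t in (0.05, 0.07, 0.30)]
for row in rows:
    print(row)
```

Collecting the rows in a list and building the table once at the end avoids growing a DataFrame inside a loop.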
# building Gaussian Naive bayes and checking train & test scores
gnb = GaussianNB()
gnb.fit(X_train,y_train)
y_prob_train = gnb.predict_proba(X_train)[:,1]
print('Train ROC-AUC:', metrics.roc_auc_score(y_train,y_prob_train))
y_prob_test = gnb.predict_proba(X_test)[:,1]
print('Test ROC-AUC:', metrics.roc_auc_score(y_test,y_prob_test))
Train ROC-AUC: 0.9207985985444084
Test ROC-AUC: 0.9365684281842819
# plotting roc_auc curve
plot_roc(y_test,y_prob_test)
# creating threshold scorecard
thresh_score_gnb = pd.DataFrame(columns=['Threshold','Precision','Recall','F1-score','Kappa'])
def cal_score_gnb(model,thresh):
    y_prob = model.predict_proba(X_test)[:,1]
    y_pred = [0 if x<thresh else 1 for x in y_prob]
    global thresh_score_gnb
    # pd.concat is used because DataFrame.append was removed in pandas 2.0
    row = pd.DataFrame([{'Threshold': thresh,
                         'Precision': metrics.precision_score(y_test, y_pred),
                         'Recall': metrics.recall_score(y_test, y_pred),
                         'F1-score': metrics.f1_score(y_test, y_pred),
                         'Kappa': metrics.cohen_kappa_score(y_test, y_pred)}])
    thresh_score_gnb = pd.concat([thresh_score_gnb, row], ignore_index = True)
# checking metrics at different thresholds
values = np.linspace(0.3,0.4,9)
for i in values:
    cal_score_gnb(gnb,i)
thresh_score_gnb
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.300000 | 0.202279 | 0.910256 | 0.331002 | 0.286491 |
| 1 | 0.312500 | 0.202857 | 0.910256 | 0.331776 | 0.287341 |
| 2 | 0.325000 | 0.204023 | 0.910256 | 0.333333 | 0.289051 |
| 3 | 0.337500 | 0.204023 | 0.910256 | 0.333333 | 0.289051 |
| 4 | 0.350000 | 0.205202 | 0.910256 | 0.334906 | 0.290778 |
| 5 | 0.362500 | 0.201754 | 0.884615 | 0.328571 | 0.284125 |
| 6 | 0.375000 | 0.202346 | 0.884615 | 0.329356 | 0.284987 |
| 7 | 0.387500 | 0.201183 | 0.871795 | 0.326923 | 0.282472 |
| 8 | 0.400000 | 0.201780 | 0.871795 | 0.327711 | 0.283338 |
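# calling youdens_table
# pass the Naive Bayes probabilities (y_prob_test); passing y_test_prob_xgb instead yields the XGBoost thresholds shown in the output below
fpr_gnb, tpr_gnb, thresholds_gnb = roc_curve(y_test, y_prob_test)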
youdens_table(tpr_gnb,fpr_gnb,thresholds_gnb).head()
| | TPR | FPR | Threshold | Difference |
|---|---|---|---|---|
| 0 | 0.923077 | 0.157520 | 0.020784 | 0.765557 |
| 1 | 0.961538 | 0.201220 | 0.015142 | 0.760319 |
| 2 | 0.948718 | 0.191565 | 0.016621 | 0.757153 |
| 3 | 0.935897 | 0.179878 | 0.017934 | 0.756019 |
| 4 | 0.897436 | 0.141768 | 0.023485 | 0.755668 |
# adding thresh from youdens index to scorecard
cal_score_gnb(gnb,0.020784)
thresh_score_gnb
| | Threshold | Precision | Recall | F1-score | Kappa |
|---|---|---|---|---|---|
| 0 | 0.300000 | 0.202279 | 0.910256 | 0.331002 | 0.286491 |
| 1 | 0.312500 | 0.202857 | 0.910256 | 0.331776 | 0.287341 |
| 2 | 0.325000 | 0.204023 | 0.910256 | 0.333333 | 0.289051 |
| 3 | 0.337500 | 0.204023 | 0.910256 | 0.333333 | 0.289051 |
| 4 | 0.350000 | 0.205202 | 0.910256 | 0.334906 | 0.290778 |
| 5 | 0.362500 | 0.201754 | 0.884615 | 0.328571 | 0.284125 |
| 6 | 0.375000 | 0.202346 | 0.884615 | 0.329356 | 0.284987 |
| 7 | 0.387500 | 0.201183 | 0.871795 | 0.326923 | 0.282472 |
| 8 | 0.400000 | 0.201780 | 0.871795 | 0.327711 | 0.283338 |
| 9 | 0.020784 | 0.159193 | 0.910256 | 0.270992 | 0.220399 |
# plotting confusion matrix at best thresh
y_pred_gnb = [0 if x < 0.350000 else 1 for x in y_prob_test]
plot_cm(y_test,y_pred_gnb)
# adding final model to the scorecard
# pd.concat is used because DataFrame.append was removed in pandas 2.0
score_card = pd.concat([score_card, pd.DataFrame([{'Model Name': 'Gaussian Naive Bayes',
                        'Threshold': 0.350000,
                        'ROC-AUC': roc_auc_score(y_test, y_prob_test),
                        'Recall': metrics.recall_score(y_test,y_pred_gnb),
                        'Precision' : metrics.precision_score(y_test,y_pred_gnb),
                        'Kappa': metrics.cohen_kappa_score(y_test,y_pred_gnb)}])], ignore_index = True)
score_card
| | Model Name | Threshold | ROC-AUC | Recall | Precision | Kappa |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression with RFE | 0.020815 | 0.959311 | 0.974359 | 0.194872 | 0.278973 |
| 1 | Decision Tree | 0.074830 | 0.906276 | 0.756410 | 0.317204 | 0.415575 |
| 2 | Random Forest | 0.050021 | 0.946236 | 0.897436 | 0.201729 | 0.284894 |
| 3 | Adaptive Boosting | 0.336667 | 0.939995 | 0.974359 | 0.116386 | 0.150043 |
| 4 | Gradient Boosting | 0.020000 | 0.952357 | 0.876712 | 0.226148 | 0.321035 |
| 5 | Extreme Gradient Boosting(XGBoost) | 0.015142 | 0.950119 | 0.961538 | 0.159236 | 0.222356 |
| 6 | K-Nearest Neighbors | 0.044444 | 0.936568 | 0.910256 | 0.197222 | 0.279018 |
| 7 | Gaussian Naive Bayes | 0.350000 | 0.936568 | 0.910256 | 0.205202 | 0.290778 |
|
1. ROC-AUC for train and test are 0.9208 and 0.9366.
2. Recall at the best threshold (0.35) is 0.910256, with precision improving slightly to 0.205202. |
|
We will follow these steps for analysis and interpretation of the models:
1. Visualise the metrics Recall, Precision and Kappa for all the models and compare them.
2. Pick the best model, then visualise and interpret its coefficients.
3. Find which features are most important across all the tree-based models. |
#creating subplots
fig, ax = plt.subplots()
#plotting columns
ax=sns.barplot(x=score_card['Recall'],y=score_card['Model Name'],color = 'lightblue', label = 'Recall')
ax=sns.barplot(x=score_card['Kappa'],y=score_card['Model Name'],color = '#03befc', label = 'Kappa')
ax=sns.barplot(x=score_card['Precision'],y=score_card['Model Name'],color = '#59a0c9', label = 'Precision')
#renaming the axes
ax.set(xlabel="Recall, Precision and Kappa", ylabel="Models")
plt.legend(bbox_to_anchor = (1,1.19))
# displaying the plot
plt.show()
|
As the logistic regression with RFE gives the highest recall (0.974359, tied with Adaptive Boosting) at a better precision (0.194872 versus 0.116386), we will choose this model.
|
cols = cols.to_list()
cols_2 = cols.copy()
# making dict for param df
params = {'feature': cols_2,
'pvals':model_logit_rfe.pvalues,
'coeffs': model_logit_rfe.params}
# making paramter df
param_df = pd.DataFrame(params).reset_index(drop=True)
param_df.head()
| | feature | pvals | coeffs |
|---|---|---|---|
| 0 | const | 0.419434 | 0.941714 |
| 1 | Non-industry income and expenditure/revenue | 0.000023 | -3.115833 |
| 2 | Cash flow rate | 0.655329 | -0.584268 |
| 3 | Interest-bearing debt interest rate | 0.000060 | 2.218588 |
| 4 | Tax rate (A) | 0.513099 | -0.495925 |
# printing significance and impact
for i in range(1,27):
    if param_df.iloc[i,1] > 0.05:
        print('*',param_df.iloc[i,0],'is not statistically significant')
        print()
    else:
        if param_df.iloc[i,2] > 0:
            print('As%s increases probability of bankruptcy increases'%(param_df.iloc[i,0]))
            print()
        else:
            print('As%s increases probability of bankruptcy decreases'%(param_df.iloc[i,0]))
            print()
As Non-industry income and expenditure/revenue increases probability of bankruptcy decreases
* Cash flow rate is not statistically significant
As Interest-bearing debt interest rate increases probability of bankruptcy increases
* Tax rate (A) is not statistically significant
As Net Value Per Share (B) increases probability of bankruptcy decreases
* After-tax Net Profit Growth Rate is not statistically significant
As Continuous Net Profit Growth Rate increases probability of bankruptcy decreases
As Total Asset Growth Rate increases probability of bankruptcy decreases
* Net Value Growth Rate is not statistically significant
* Total Asset Return Growth Rate Ratio is not statistically significant
As Cash Reinvestment % increases probability of bankruptcy decreases
* Quick Ratio is not statistically significant
* Long-term fund suitability ratio (A) is not statistically significant
As Borrowing dependency increases probability of bankruptcy increases
As Accounts Receivable Turnover increases probability of bankruptcy decreases
* Fixed Assets Turnover Frequency is not statistically significant
As Revenue per person increases probability of bankruptcy increases
* Cash/Total Assets is not statistically significant
As Inventory/Working Capital increases probability of bankruptcy decreases
* Retained Earnings to Total Assets is not statistically significant
As Total expense/Assets increases probability of bankruptcy increases
As Current Asset Turnover Rate increases probability of bankruptcy increases
* Cash Flow to Sales is not statistically significant
* No-credit Interval is not statistically significant
* Gross Profit to Sales is not statistically significant
* Degree of Financial Leverage (DFL) is not statistically significant
# visualising coefficients and their significance
colors = ['lightblue' if x < 0.05 else '#ed5858' for x in param_df['pvals']]
plt.figure(figsize = (20,15))
sns.barplot(x = param_df['coeffs'], y = param_df['feature'], palette = colors)
plt.show()
|
The red bars mean the feature is not statistically significant.
|
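The sign rule used in the printout above follows from the logistic model: a one-unit increase in a feature multiplies the odds of bankruptcy by exp(coefficient), so a positive coefficient raises the odds and a negative one lowers them. Below is a small sketch using the two statistically significant coefficients visible in `param_df.head()`. The dictionary names are illustrative, and how large a "one-unit" change is depends on how the features were scaled, which is an assumption to check against the preprocessing.

```python
import math

# Coefficients copied from the param_df.head() output above.
coeffs = {
    'Non-industry income and expenditure/revenue': -3.115833,
    'Interest-bearing debt interest rate': 2.218588,
}

# exp(coefficient) is the multiplicative change in odds per one-unit increase
odds_ratios = {name: math.exp(beta) for name, beta in coeffs.items()}

for name, ratio in odds_ratios.items():
    direction = 'increases' if ratio > 1 else 'decreases'
    print(f'{name}: odds multiplied by {ratio:.3f}, '
          f'so the probability of bankruptcy {direction}')
```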
# making composite df for top 5 most important features for all tree based models
imp_all = pd.DataFrame({'Feat':X_train_dtc.columns,'Importance':dtc.feature_importances_}).sort_values(by='Importance',
ascending=False).head().reset_index(drop=True)
for i in [rfc,adac,gb_tuned,xgb]:
    imp = pd.DataFrame({'Feat':X_train.columns,'Importance':i.feature_importances_}).sort_values(by='Importance',
                        ascending=False).head().reset_index(drop=True)
    imp_all = pd.concat([imp_all,imp])
imp_all.shape
(25, 2)
# visualising important features
sns.countplot(y = imp_all['Feat'], palette = 'Blues_d')
plt.show()
|
Net value per share and borrowing dependency appear most often among the top five features of the tree-based models, so they are the most important features.
|
|
Let's analyse the common false positives
|
# making dfs for false positives and true negatives
fp_duplicates = pd.DataFrame()
tn_duplicates = pd.DataFrame()
for i in [y_pred_rfe,y_pred_ada,y_pred_gnb,y_pred_knn,y_pred_rfc,y_pred_xgb]:
    test = X_test.copy()
    test['pred'] = i
    test['actual'] = y_test
    fp = test[(test['pred']==1) & (test['actual']==0)]
    tn = test[(test['pred']==0) & (test['actual']==0)]
    fp = fp.drop(['pred','actual'],axis=1)
    tn = tn.drop(['pred','actual'],axis=1)
    fp_duplicates = pd.concat([fp_duplicates,fp])
    tn_duplicates = pd.concat([tn_duplicates,tn])
#checking shape
print(fp_duplicates.shape)
print(tn_duplicates.shape)
(2128, 47)
(9680, 47)
# checking total duplicates
print('Common FP:',fp_duplicates.duplicated(keep = False).sum())
print('Common TN:',tn_duplicates.duplicated(keep = False).sum())
Common FP: 1962
Common TN: 9627
# filtering out common false positives
fp_duplicates = fp_duplicates[fp_duplicates.duplicated(keep = False)]
# filtering out common true negatives
tn_duplicates = tn_duplicates[tn_duplicates.duplicated(keep = False)]
print(fp_duplicates.shape)
print(tn_duplicates.shape)
(1962, 47)
(9627, 47)
# dropping duplicates
fp_duplicates = fp_duplicates.drop_duplicates()
tn_duplicates = tn_duplicates.drop_duplicates()
print(fp_duplicates.shape)
print(tn_duplicates.shape)
(481, 47)
(1765, 47)
# summary stats on misclassifications
fp_duplicates.describe()
| | Non-industry income and expenditure/revenue | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Quick Ratio | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Revenue per person | Operating profit per person | Allocation rate per person | Cash/Total Assets | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Retained Earnings to Total Assets | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Equity to Long-term Liability | Cash Flow to Liability | Cash Flow to Equity | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Degree of Financial Leverage (DFL) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 | 481.000000 |
| mean | 0.371778 | 0.213736 | 0.260305 | 0.375029 | 0.417811 | 0.089226 | 0.345068 | 0.391936 | 0.277659 | 0.412487 | 0.423530 | 0.415439 | 0.417790 | 0.585875 | 0.356151 | 0.417738 | 0.416915 | 0.220432 | 0.343584 | 0.487336 | 0.131654 | 0.384431 | 0.451866 | 0.334874 | 0.217593 | 0.512308 | 0.372047 | 0.565475 | 0.370682 | 0.543766 | 0.312139 | 0.693966 | 0.437282 | 0.310851 | 0.403161 | 0.488734 | 0.286162 | 0.253194 | 0.456178 | 0.369980 | 0.331459 | 0.447429 | 0.457486 | 0.456426 | 0.417912 | 0.405428 | 0.411477 |
| std | 0.154196 | 0.350955 | 0.318646 | 0.121534 | 0.199851 | 0.193776 | 0.148257 | 0.149024 | 0.225427 | 0.196454 | 0.179432 | 0.183218 | 0.182117 | 0.147384 | 0.159110 | 0.188697 | 0.169212 | 0.163540 | 0.163777 | 0.255438 | 0.230894 | 0.184920 | 0.235306 | 0.363072 | 0.149696 | 0.211627 | 0.175473 | 0.218492 | 0.180017 | 0.226959 | 0.243186 | 0.241038 | 0.197070 | 0.132807 | 0.221794 | 0.153110 | 0.390408 | 0.277715 | 0.152868 | 0.262363 | 0.297375 | 0.134359 | 0.171296 | 0.230890 | 0.172493 | 0.159944 | 0.245581 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.010813 | 0.000000 | 0.000000 | 0.014268 | 0.008137 | 0.002292 | 0.011376 | 0.005584 | 0.000479 | 0.000890 | 0.084198 | 0.003762 | 0.011732 | 0.001555 | 0.004299 | 0.165934 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.091990 | 0.054282 | 0.020482 | 0.066256 | 0.037781 | 0.018940 | 0.000000 | 0.052952 | 0.024421 | 0.000000 | 0.040874 | 0.332187 | 0.000000 | 0.000000 | 0.003831 | 0.001251 | 0.000000 | 0.014920 | 0.010934 | 0.095868 | 0.006490 | 0.022859 | 0.000000 |
| 25% | 0.275318 | 0.000000 | 0.000000 | 0.303635 | 0.263158 | 0.000000 | 0.251936 | 0.309194 | 0.116096 | 0.278060 | 0.305946 | 0.302368 | 0.304222 | 0.501730 | 0.251274 | 0.280025 | 0.319207 | 0.116457 | 0.225635 | 0.294435 | 0.000000 | 0.235918 | 0.270568 | 0.000000 | 0.129976 | 0.342789 | 0.259940 | 0.409802 | 0.238349 | 0.388743 | 0.132126 | 0.534219 | 0.291961 | 0.220490 | 0.230524 | 0.370253 | 0.000000 | 0.000000 | 0.382366 | 0.150353 | 0.003931 | 0.397563 | 0.370297 | 0.261710 | 0.307428 | 0.299766 | 0.206398 |
| 50% | 0.378626 | 0.000000 | 0.054421 | 0.383037 | 0.378168 | 0.000000 | 0.343661 | 0.393002 | 0.206976 | 0.402525 | 0.422101 | 0.412461 | 0.422124 | 0.574394 | 0.354509 | 0.412473 | 0.424767 | 0.193689 | 0.283451 | 0.521313 | 0.000000 | 0.349684 | 0.432952 | 0.185565 | 0.149094 | 0.477410 | 0.351205 | 0.562018 | 0.337005 | 0.536058 | 0.256588 | 0.732479 | 0.424469 | 0.315444 | 0.361947 | 0.440564 | 0.000000 | 0.161000 | 0.457382 | 0.319056 | 0.287742 | 0.450392 | 0.457506 | 0.408715 | 0.412364 | 0.383455 | 0.359223 |
| 75% | 0.467176 | 0.500250 | 0.490325 | 0.446269 | 0.573099 | 0.002757 | 0.423155 | 0.465419 | 0.365351 | 0.526722 | 0.539983 | 0.536188 | 0.535590 | 0.673587 | 0.437278 | 0.516209 | 0.513608 | 0.273043 | 0.414879 | 0.676512 | 0.194489 | 0.490506 | 0.624854 | 0.701065 | 0.244576 | 0.661065 | 0.453313 | 0.732362 | 0.479739 | 0.712170 | 0.426574 | 0.897555 | 0.582141 | 0.408691 | 0.538986 | 0.549869 | 0.730753 | 0.403000 | 0.526176 | 0.563640 | 0.575210 | 0.501703 | 0.535286 | 0.638054 | 0.508031 | 0.473269 | 0.635209 |
| max | 0.991858 | 0.998998 | 0.994760 | 0.869577 | 0.893762 | 0.992320 | 0.913575 | 0.977217 | 0.988701 | 0.986215 | 0.999318 | 0.987883 | 0.990721 | 0.989619 | 0.997420 | 0.998456 | 0.982115 | 0.972372 | 0.971134 | 1.000000 | 0.978781 | 0.993671 | 0.980674 | 0.997494 | 0.902266 | 0.989533 | 0.999699 | 0.998111 | 0.995524 | 0.997058 | 0.989795 | 1.000000 | 0.985481 | 0.738459 | 1.000000 | 0.980141 | 0.998999 | 0.989000 | 0.972692 | 0.999185 | 0.994866 | 0.977011 | 1.000000 | 0.997235 | 1.000000 | 0.990249 | 0.992531 |
tn_duplicates.describe()
| | Non-industry income and expenditure/revenue | Operating Expense Rate | Research and development expense rate | Cash flow rate | Interest-bearing debt interest rate | Tax rate (A) | Net Value Per Share (B) | Cash Flow Per Share | Revenue Per Share (Yuan ¥) | Realized Sales Gross Profit Growth Rate | Operating Profit Growth Rate | After-tax Net Profit Growth Rate | Continuous Net Profit Growth Rate | Total Asset Growth Rate | Net Value Growth Rate | Total Asset Return Growth Rate Ratio | Cash Reinvestment % | Quick Ratio | Long-term fund suitability ratio (A) | Borrowing dependency | Contingent liabilities/Net worth | Accounts Receivable Turnover | Average Collection Days | Inventory Turnover Rate (times) | Fixed Assets Turnover Frequency | Revenue per person | Operating profit per person | Allocation rate per person | Cash/Total Assets | Inventory/Working Capital | Inventory/Current Liability | Current Liabilities/Liability | Working Capital/Equity | Retained Earnings to Total Assets | Total expense/Assets | Current Asset Turnover Rate | Quick Asset Turnover Rate | Cash Turnover Rate | Cash Flow to Sales | Fixed Assets to Assets | Equity to Long-term Liability | Cash Flow to Liability | Cash Flow to Equity | Total assets to GNP price | No-credit Interval | Gross Profit to Sales | Degree of Financial Leverage (DFL) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 | 1765.000000 |
| mean | 0.535159 | 0.249318 | 0.316801 | 0.518263 | 0.301301 | 0.229861 | 0.523715 | 0.517056 | 0.366804 | 0.492831 | 0.497460 | 0.506130 | 0.505634 | 0.662087 | 0.502025 | 0.505343 | 0.506773 | 0.400607 | 0.418151 | 0.209064 | 0.138017 | 0.414949 | 0.417028 | 0.281285 | 0.264231 | 0.528142 | 0.519065 | 0.491264 | 0.509911 | 0.520119 | 0.336458 | 0.721183 | 0.525647 | 0.533477 | 0.373511 | 0.449677 | 0.262977 | 0.241072 | 0.505406 | 0.292847 | 0.202263 | 0.501595 | 0.502286 | 0.436810 | 0.509751 | 0.520312 | 0.491837 |
| std | 0.167349 | 0.373990 | 0.321259 | 0.189458 | 0.213570 | 0.220096 | 0.173741 | 0.186796 | 0.234133 | 0.164622 | 0.163174 | 0.163015 | 0.165368 | 0.137466 | 0.164325 | 0.176785 | 0.182723 | 0.225064 | 0.184051 | 0.215644 | 0.245070 | 0.174715 | 0.197843 | 0.378984 | 0.180245 | 0.192567 | 0.179307 | 0.204594 | 0.226508 | 0.160914 | 0.237833 | 0.240550 | 0.158084 | 0.177991 | 0.212591 | 0.114609 | 0.382667 | 0.297664 | 0.187344 | 0.230534 | 0.265798 | 0.190177 | 0.189258 | 0.201959 | 0.170472 | 0.172982 | 0.169111 |
| min | 0.003562 | 0.000000 | 0.000000 | 0.005918 | 0.000000 | 0.000000 | 0.014268 | 0.000000 | 0.000000 | 0.009770 | 0.003346 | 0.014242 | 0.000890 | 0.114187 | 0.010104 | 0.002470 | 0.000000 | 0.005255 | 0.172392 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.027410 | 0.059039 | 0.000000 | 0.018940 | 0.000000 | 0.011337 | 0.025195 | 0.003343 | 0.000000 | 0.295816 | 0.000000 | 0.000000 | 0.000000 | 0.000128 | 0.000000 | 0.004014 | 0.000000 | 0.095868 | 0.000411 | 0.077297 | 0.000000 |
| 25% | 0.442239 | 0.000000 | 0.000000 | 0.387710 | 0.167641 | 0.000000 | 0.402772 | 0.395443 | 0.181759 | 0.390658 | 0.404156 | 0.406156 | 0.407107 | 0.575548 | 0.395464 | 0.395184 | 0.397551 | 0.233207 | 0.269357 | 0.021833 | 0.000000 | 0.282437 | 0.271162 | 0.000000 | 0.140588 | 0.377083 | 0.400602 | 0.350414 | 0.331412 | 0.398674 | 0.149237 | 0.552263 | 0.409049 | 0.428846 | 0.214891 | 0.366674 | 0.000000 | 0.000000 | 0.407343 | 0.107452 | 0.000000 | 0.397402 | 0.400363 | 0.270447 | 0.413698 | 0.387976 | 0.375202 |
| 50% | 0.517557 | 0.000000 | 0.255031 | 0.492324 | 0.277778 | 0.199882 | 0.499796 | 0.502848 | 0.305060 | 0.478676 | 0.485017 | 0.492372 | 0.491789 | 0.656286 | 0.473245 | 0.497684 | 0.503694 | 0.343056 | 0.362393 | 0.147838 | 0.000000 | 0.381329 | 0.403099 | 0.000000 | 0.193632 | 0.496103 | 0.487048 | 0.479699 | 0.486355 | 0.483433 | 0.300625 | 0.775025 | 0.519153 | 0.523095 | 0.323812 | 0.415788 | 0.000000 | 0.084200 | 0.487868 | 0.234699 | 0.053941 | 0.481087 | 0.491121 | 0.403945 | 0.503656 | 0.492678 | 0.429839 |
| 75% | 0.622392 | 0.642972 | 0.591264 | 0.637254 | 0.397856 | 0.413942 | 0.631879 | 0.633849 | 0.524316 | 0.581995 | 0.590324 | 0.596486 | 0.598730 | 0.754325 | 0.596259 | 0.604816 | 0.625000 | 0.544468 | 0.536650 | 0.329431 | 0.192598 | 0.524525 | 0.546696 | 0.703914 | 0.322543 | 0.665454 | 0.631928 | 0.625659 | 0.694187 | 0.614527 | 0.486797 | 0.937310 | 0.636058 | 0.644735 | 0.500636 | 0.490379 | 0.691375 | 0.446000 | 0.613538 | 0.433349 | 0.349836 | 0.615767 | 0.610695 | 0.575751 | 0.608142 | 0.631001 | 0.599676 |
| max | 0.996947 | 0.999499 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 0.999185 | 1.000000 | 0.999509 | 0.997636 | 0.999318 | 0.998201 | 0.997408 | 0.998847 | 1.000000 | 0.999074 | 0.999222 | 0.997930 | 0.999678 | 0.989691 | 1.000000 | 0.993671 | 0.987908 | 1.000000 | 1.000000 | 0.998426 | 1.000000 | 0.998111 | 0.997822 | 0.997058 | 0.999444 | 1.000000 | 0.999447 | 0.996544 | 1.000000 | 0.992159 | 0.999500 | 1.000000 | 0.997804 | 1.000000 | 0.999649 | 0.995158 | 1.000000 | 0.999670 | 1.000000 | 0.995227 | 0.991245 |
# plot to compare medians of true negatives vs false positives
plt.figure(figsize = (25,20))
plt.plot(fp_duplicates.median(),fp_duplicates.columns,'r-o',markersize = 15,
label = 'False Positives')
plt.plot(tn_duplicates.median(),tn_duplicates.columns,'b-o',markersize = 15,
label = 'True Negatives')
plt.legend()
plt.show()
# visualising boxplots for misclassifications
plt.figure(figsize=(20,15))
sns.boxplot(data = fp_duplicates, orient = 'h', palette='Blues_d')
plt.show()
# preparing data to compare rfe misclassifications
train = X_train[cols].copy()
test = X_test[cols].copy()
test['pred'] = y_pred_rfe
test['actual'] = y_test
# filtering out misclassifications for logit_rfe
fn = test[(test['pred']==0) & (test['actual']==1)]
fp = test[(test['pred']==1) & (test['actual']==0)]
tn = test[(test['pred']==0) & (test['actual']==0)]
# printing shape
fp.shape
(314, 28)
# plot to compare medians of true negatives vs false positives
plt.figure(figsize = (20,15))
plt.plot(fp.drop(['pred','actual'],axis=1).median(),fp.drop(['pred','actual'],axis=1).columns,'r-o',markersize = 15,
label = 'False Positives')
plt.plot(tn.drop(['pred','actual'],axis=1).median(),fp.drop(['pred','actual'],axis=1).columns,'b-o',markersize = 15,
label = 'True Negatives')
plt.legend()
plt.show()
|
We can say that the range in which these financial ratios fall together for the misclassified companies is a grey area, where non-bankrupt companies are flagged as bankrupt.
|
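The search for common false positives above stacks the per-model false positives and keeps the duplicated feature rows, which amounts to keeping observations flagged by at least two models (assuming no two companies share identical feature values). A hedged alternative, sketched here on toy predictions with hypothetical model names, is to count false positives per row position, which makes the "how many models agree" criterion explicit.

```python
from collections import Counter

# Toy ground truth and predictions, made up for illustration.
y_test = [0, 0, 1, 0, 0, 1]
preds = {
    'model_a': [1, 0, 1, 1, 0, 0],
    'model_b': [1, 0, 1, 0, 0, 1],
    'model_c': [1, 1, 0, 1, 0, 1],
}

# Count, for every row position, how many models produce a false positive.
fp_counts = Counter()
for name, y_pred in preds.items():
    for idx, (actual, pred) in enumerate(zip(y_test, y_pred)):
        if actual == 0 and pred == 1:
            fp_counts[idx] += 1

# Rows flagged by at least two models, and rows flagged by every model.
common_fp = sorted(i for i, c in fp_counts.items() if c >= 2)
fp_by_all = sorted(i for i, c in fp_counts.items() if c == len(preds))
print('false positive in >= 2 models:', common_fp)
print('false positive in all models: ', fp_by_all)
```

Raising the agreement requirement from two models to all models narrows the grey area to the companies that every classifier finds hard to separate.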